Traditional methods for studying sound propagation inside rooms can be divided into two approaches: geometrical models and wave-based models. The former treat sound as rays, which is a valid approximation at high frequencies but fails to model wave effects such as diffraction and interference. The latter solve the wave equation, providing better accuracy at the cost of much higher computational complexity.
This thesis presents a proof of concept for a novel machine learning method that estimates a set of typical room acoustics parameters using only geometrical information as input features. First, a room acoustics dataset composed of real-world acoustical measurements is analyzed and processed using microphone array encoding techniques to extract room impulse responses and acoustical absorption area for multiple directions. The dataset is then explored to identify correlations between features and its general properties, including a low-dimensional representation for visualization.
The proposed method uses geometrical features as input to a neural network model that estimates room acoustics parameters such as reverberation time (T60) and early decay time (EDT). For reverberation time, the model is evaluated against the Sabine method, and the results show much higher accuracy, especially at low frequencies. The method is then extended with input features for the locations of the source and microphone, and it again achieves high performance.
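For reference, the Sabine baseline is commonly written as T60 ≈ 0.161 V / A, where V is the room volume in cubic metres and A is the total absorption area in square metres. The following is a minimal Python sketch of that formula, assuming each surface is described by an area and an absorption coefficient. The function name `sabine_t60` and the example room are illustrative assumptions and are not taken from the thesis.

```python
def sabine_t60(volume_m3, surfaces):
    """Estimate reverberation time (seconds) with Sabine's formula.

    volume_m3: room volume in cubic metres.
    surfaces:  iterable of (area_m2, absorption_coefficient) pairs,
               one per surface; coefficients lie between 0 and 1.
    """
    # Total absorption area A: sum of area times absorption coefficient.
    absorption_area = sum(area * alpha for area, alpha in surfaces)
    if absorption_area <= 0:
        raise ValueError("total absorption area must be positive")
    # 0.161 s/m is the usual Sabine constant for air at room temperature.
    return 0.161 * volume_m3 / absorption_area


# Hypothetical 5 m x 4 m x 3 m room with the same coefficient on every surface.
example_room = [
    (20.0, 0.1),  # floor
    (20.0, 0.1),  # ceiling
    (54.0, 0.1),  # four walls combined
]
example_t60 = sabine_t60(60.0, example_room)
```

In practice the absorption coefficients depend on frequency, so the formula would be evaluated once per frequency band.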
Furthermore, a hyperparameter optimization procedure using random search reveals three main findings. First, a wide range of neural network architectures, even those with very few trainable parameters, achieve high performance. Second, the depth of the models has little influence on the results. Third, the benefit of adding training examples for a single loudspeaker saturates after around 100 examples.
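As a rough illustration of random search in general, the sketch below samples configurations uniformly from a discrete search space and keeps the one with the lowest score. Everything in it is an assumption made for illustration: the search space, the parameter names, and `toy_objective`, which is a placeholder standing in for training a model and measuring its validation error. It does not reproduce the procedure or the search ranges used in the thesis.

```python
import random

# Hypothetical search space; the values are illustrative only.
SEARCH_SPACE = {
    "hidden_layers": [1, 2, 3, 4],
    "units_per_layer": [4, 8, 16, 32, 64],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def sample_config(rng):
    """Draw one configuration by picking each value uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(evaluate, n_trials, seed=0):
    """Return the best (config, score) found in n_trials random draws.

    evaluate: callable mapping a config dict to a score; lower is better.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    """Placeholder score; a real study would train and validate a model here."""
    return abs(config["units_per_layer"] - 16) + abs(config["learning_rate"] - 1e-3)


best_config, best_score = random_search(toy_objective, n_trials=50)
```

Because each trial is independent, the loop can be parallelized, and the full list of sampled configurations can be kept to study how sensitive the score is to each hyperparameter.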