Browsing by Author "Schlecht, Sebastian Jiro"
- Loudspeaker Modelling with Recurrent Neural Networks
School of Electrical Engineering | Master's thesis (2023-08-21) Kerimovs, Teodors
Digital twins of loudspeakers are useful assets for fine-tuning during the design and manufacturing phases. They can serve as an alternative to real-time measurement for the objective evaluation of adjustments made by digital signal processing. Binaural loudspeaker models could introduce a more repeatable framework for subjective listening and provide flexibility for remote work, since they reduce the need for physical devices. Neural networks are a well-proven tool for system identification of audio hardware devices. This thesis focuses on creating a digital twin of a multimedia stereo loudspeaker system, using a stereo audio waveform as the input and a binaural recording of the system's playback as the target waveform for Recurrent Neural Network (RNN) training. The RNN architecture is inspired by the current state-of-the-art method for single-channel audio effects modelling and is adapted for the stereo waveform use case. Firstly, the RNN model is tested with different synthesized target data that simulate the real recorded data. This approach allows us to estimate which properties are the most challenging for the RNN to learn. Secondly, the experiments are run with a real, recorded, time-aligned dataset, and the RNN's performance is objectively evaluated by the Error-to-Signal Ratio (ESR). In the current state-of-the-art method for single-channel audio modelling, the initial hidden state of the RNN is computed by no-gradient startup inference, which accumulates the hidden state over the first few hundred samples of the training sequence. The thesis proposes a new method called Discontinuous Sequence Training (DISCO).
The method prepares the training dataset according to the RNN architecture's sequence-length hyperparameter and the system's impulse response length, so that the initial hidden state is correctly initialized without additional pre-training inference. DISCO matches the training and inference precision achieved by the hidden-state initialization of the current state-of-the-art method for black-box modelling with RNNs, solely by modifying the dataset.
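The Error-to-Signal Ratio mentioned in the abstract is commonly defined as the energy of the prediction error divided by the energy of the target signal. The following is a minimal sketch of that common definition for one channel, not the thesis's own evaluation code: the function name `error_to_signal_ratio` is hypothetical, and it is assumed here that no pre-emphasis filtering is applied before the ratio is computed.

```python
def error_to_signal_ratio(target, prediction):
    """Return sum((y - y_hat)^2) / sum(y^2) for two equal-length sample sequences.

    Assumption: plain ESR on a single channel, without pre-emphasis filtering.
    """
    if len(target) != len(prediction):
        raise ValueError("target and prediction must have the same length")
    # Energy of the sample-wise difference between target and model output.
    error_energy = sum((y - y_hat) ** 2 for y, y_hat in zip(target, prediction))
    # Energy of the target signal, used for normalization.
    signal_energy = sum(y ** 2 for y in target)
    if signal_energy == 0.0:
        raise ValueError("target signal has zero energy; ESR is undefined")
    return error_energy / signal_energy


# Toy samples for illustration only (not data from the thesis).
recorded = [1.0, -1.0, 0.5, -0.5]
perfect_esr = error_to_signal_ratio(recorded, recorded)
silent_esr = error_to_signal_ratio(recorded, [0.0] * len(recorded))
```

Under this definition a perfect prediction gives an ESR of 0, while a model that outputs only silence gives an ESR of 1, so lower values indicate a closer match to the recording.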