State-Space Virtual Analogue Modelling of Audio Circuits

Sähkötekniikan korkeakoulu | Master's thesis

Mcode

ELEC3030

Language

en

Pages

73

Abstract

This thesis investigates the use of modern deep learning architectures within the grey-box domain of virtual analogue modelling of audio devices. The aim is to assess the efficacy of including internal system states within a recurrent neural network structure. To accomplish this, a modification based on two existing neural network models is proposed that maps the network's internal hidden states directly to the system states. The proposed model was tested alongside the two state-of-the-art approaches it was based on, one from the black-box domain and one from the grey-box domain, across three nonlinear analogue audio devices. Its accuracy was analysed using time-domain views, spectral analysis, perceptual tests, and numerical error metrics. Since the proposed model lies between the two approaches it was compared to, this analysis made it possible to separate the most significant benefits contributed by each modelling approach. The results demonstrated that both state-space approaches can provide benefits when emulating complex self-oscillating devices, but can struggle to match the performance of the existing black-box model for more input-dependent nonlinear devices. Furthermore, the results showed that the changes made to the previous state-of-the-art grey-box model aided the proposed structure across the modelling tasks: the proposed architecture offered more training stability, lower variance across parameter initialisations, improved hyperparameter robustness, and an overall improvement in inference accuracy. Moreover, including the system state-space in the proposed model offered benefits in relative training time compared to its black-box counterpart, since it performed equivalently or better on each device when using shorter sequential predictions.

Supervisor

Välimäki, Vesa

Thesis advisor

Damskägg, Eero-Pekka
