Neural Modelling of Audio Effects

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Välimäki, Vesa, Prof., Aalto University, Department of Information and Communications Engineering, Finland
dc.contributor.author: Wright, Alec
dc.contributor.department: Informaatio- ja tietoliikennetekniikan laitos [fi]
dc.contributor.department: Department of Information and Communications Engineering [en]
dc.contributor.lab: Audio Signal Processing [en]
dc.contributor.school: Sähkötekniikan korkeakoulu [fi]
dc.contributor.school: School of Electrical Engineering [en]
dc.contributor.supervisor: Välimäki, Vesa, Prof., Aalto University, Department of Information and Communications Engineering, Finland
dc.date.accessioned: 2023-12-02T10:00:27Z
dc.date.available: 2023-12-02T10:00:27Z
dc.date.defence: 2023-12-15
dc.date.issued: 2023
dc.description.abstract: Neural networks and other machine learning based approaches to audio effects processing have become increasingly popular in recent years. This thesis focuses on the design and training of neural network architectures for the emulation of specific analog audio devices from data. The digital emulation of analog audio devices is commonly known as virtual analog, and popular effects processing devices for virtual analog modelling include guitar amplifiers, distortion pedals, time-varying effects, and compressors. Whilst analytical methods based on circuit analysis are capable of producing realistic, efficient and accurate models of devices, these approaches are limited by the fact that creating a model of a specific device is time-consuming and requires expert knowledge. In contrast, neural network based methods allow for greater automation in the modelling process, and can be applied relatively easily to a range of devices as long as sufficient data is available. This thesis proposes a number of neural network based methods for audio effects modelling, and shows that they achieve excellent perceptual emulation quality. The proposed models include convolutional, recurrent and differentiable digital signal processing based architectures. There is a focus on models with low computational cost and low latency, such that they are suitable for real-time processing as part of a music production workflow. Methods for modelling Low-Frequency Oscillator (LFO) modulated time-varying effects, compressors, guitar amplifiers and distortion pedals are proposed. In addition to the neural network architectures themselves, this thesis also provides practical details and methods for training the models. This includes the proposal and validation of a novel perceptually motivated pre-emphasis filter, used to model non-linear audio effects processing. Additionally, a pruning method is applied and shown to achieve significant reduction in model size and inference cost for guitar amplifier and distortion effects modelling. Finally, this thesis presents a novel method for the task of modelling non-linear audio effects processing when paired training data is unavailable. This allows for complex non-linear effects processing to be emulated from recordings, whilst requiring no knowledge of the specific devices used to create the recording. [en]
dc.format.extent: 54 + app. 86
dc.identifier.isbn: 978-952-64-1576-5 (electronic)
dc.identifier.isbn: 978-952-64-1575-8 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/124687
dc.identifier.urn: URN:ISBN:978-952-64-1576-5
dc.language.iso: en [en]
dc.opn: Reiss, Josh, Prof., Centre for Digital Music, Queen Mary University of London, UK
dc.publisher: Aalto University [en]
dc.publisher: Aalto-yliopisto [fi]
dc.relation.haspart: [Publication 1]: A. Wright, E.-P. Damskägg and V. Välimäki. Real-time black-box modelling with recurrent neural networks. In Proc. 22nd Int. Conf. Digital Audio Effects (DAFx), Birmingham, UK, Sept. 2019. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-201909205347.
dc.relation.haspart: [Publication 2]: A. Wright, E.-P. Damskägg, L. Juvela and V. Välimäki. Real-time guitar amplifier emulation with deep learning. Applied Sciences, Vol. 10, No. 3, Jan. 2020. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-202004092775. DOI: 10.3390/app10030766
dc.relation.haspart: [Publication 3]: A. Wright and V. Välimäki. Neural modeling of phaser and flanging effects. J. Audio Eng. Soc., Vol. 69, No. 7/8, pp. 517-529, July 2021. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-202109299368. DOI: 10.17743/jaes.2021.0029
dc.relation.haspart: [Publication 4]: A. Wright and V. Välimäki. Grey-box modelling of dynamic range compression. In Proc. Int. Conf. on Digital Audio Effects (DAFx 20in22), Vienna, Austria, pp. 304-311, Sept. 2022. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-202210196073.
dc.relation.haspart: [Publication 5]: A. Wright and V. Välimäki. Perceptual loss function for neural modeling of audio systems. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP 2020), Barcelona, Spain, pp. 251-255, May 2020. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-202007034257. DOI: 10.1109/ICASSP40776.2020.9052944
dc.relation.haspart: [Publication 6]: D. Südholt, A. Wright, C. Erkut and V. Välimäki. Pruning deep neural network models of guitar distortion effects. IEEE/ACM Trans. on Audio, Speech, and Language Processing, Vol. 31, pp. 256-264, Nov. 2022. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-202212227224. DOI: 10.1109/TASLP.2022.3223257
dc.relation.haspart: [Publication 7]: A. Wright, V. Välimäki and L. Juvela. Adversarial guitar amplifier modelling with unpaired data. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes, Greece, June 2023. Full text in Acris/Aaltodoc https://urn.fi/URN:NBN:fi:aalto-202310116245. DOI: 10.1109/ICASSP49357.2023.10094600
dc.relation.ispartofseries: Aalto University publication series DOCTORAL THESES [en]
dc.relation.ispartofseries: 217/2023
dc.rev: Prof. Sisman, Berrak, Asst. Prof., Erik Jonsson School of Engineering and Computer Science, University of Texas at Dallas, USA
dc.rev: Martínez, Marco, Dr., Sony AI, Japan
dc.subject.keyword: audio effects processing [en]
dc.subject.keyword: deep learning [en]
dc.subject.keyword: neural networks [en]
dc.subject.keyword: nonlinear systems [en]
dc.subject.keyword: machine learning [en]
dc.subject.other: Information systems [en]
dc.title: Neural Modelling of Audio Effects [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.dcmitype: text [en]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.acrisexportstatus: checked 2023-12-19_1430
local.aalto.archive: yes
local.aalto.formfolder: 2023_12_01_klo_13_07
local.aalto.infra: Aalto Acoustics Lab
Files
Original bundle
Name: isbn9789526415765.pdf
Size: 4.67 MB
Format: Adobe Portable Document Format