Estimation and Restoration of Unknown Nonlinear Distortion Using Diffusion

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Švento, Michal
dc.contributor.author: Moliner Juanpere, Eloi
dc.contributor.author: Juvela, Lauri
dc.contributor.author: Wright, Alec
dc.contributor.author: Välimäki, Vesa
dc.contributor.department: Department of Information and Communications Engineering [en]
dc.contributor.groupauthor: Audio Signal Processing [en]
dc.contributor.groupauthor: Speech Synthesis [en]
dc.contributor.organization: University of Edinburgh
dc.date.accessioned: 2025-10-08T06:55:56Z
dc.date.available: 2025-10-08T06:55:56Z
dc.date.issued: 2025-09-01
dc.description: Publisher Copyright: © 2025, Audio Engineering Society. All rights reserved.
dc.description.abstract: The restoration of nonlinearly distorted audio signals, alongside the identification of the applied memoryless nonlinear operation, is studied. The paper focuses on the difficult but practically important case in which both the nonlinearity and the original input signal are unknown. The proposed method uses a generative diffusion model trained unconditionally on guitar or speech signals to jointly model and invert the nonlinear system at inference time. Both the memoryless nonlinear function model and the restored audio signal are obtained as output. Examples of successful blind estimation of hard and soft-clipping, digital quantization, half-wave rectification, and wavefolding nonlinearities are presented. The results suggest that, out of the nonlinear functions tested here, the Cubic Catmull-Rom spline is best suited to approximating these nonlinearities. In the case of guitar recordings, comparisons with informed and supervised restoration methods show that the proposed blind method is at least as good as they are in terms of objective metrics. Experiments on distorted speech show that the proposed blind method outperforms general-purpose speech enhancement techniques and restores the original voice quality. The proposed method can be applied to memoryless audio effects modeling, restoration of music and speech recordings, and characterization of analog recording media. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 14
dc.format.mimetype: application/pdf
dc.identifier.citation: Švento, M, Moliner Juanpere, E, Juvela, L, Wright, A & Välimäki, V 2025, 'Estimation and Restoration of Unknown Nonlinear Distortion Using Diffusion', AES: Journal of the Audio Engineering Society, vol. 73, no. 9, pp. 519-532. https://doi.org/10.17743/jaes.2022.0221 [en]
dc.identifier.doi: 10.17743/jaes.2022.0221
dc.identifier.issn: 1549-4950
dc.identifier.other: PURE UUID: 6d7b6aa4-f6ac-4849-8de4-a3902ddd59b7
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/6d7b6aa4-f6ac-4849-8de4-a3902ddd59b7
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/198063457/Estimation_and_Restoration_of_Unknown_Nonlinear_Distortion_Using_Diffusion.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/139612
dc.identifier.urn: URN:NBN:fi:aalto-202510087793
dc.language.iso: en [en]
dc.publisher: Audio Engineering Society
dc.relation.fundinginfo: This study was conducted during a 4-month Erasmus+ traineeship of the first author at the Aalto Acoustics Lab from August to December 2024. The work of the first author was also supported by the Czech Science Foundation (GAČR) Project No. 23-07294S. The authors acknowledge the computational resources provided by the Aalto Science-IT project. The authors are grateful to Professor Pavel Rajmic for his support and helpful discussions.
dc.relation.ispartofseries: AES: Journal of the Audio Engineering Society [en]
dc.relation.ispartofseries: Volume 73, issue 9, pp. 519-532 [en]
dc.rights: openAccess [en]
dc.title: Estimation and Restoration of Unknown Nonlinear Distortion Using Diffusion [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (A1 Original article in a scientific journal) [fi]
dc.type.version: publishedVersion

Files

Original bundle

Name: Estimation_and_Restoration_of_Unknown_Nonlinear_Distortion_Using_Diffusion.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format