Automatic Audio Equalization with Semantic Embeddings

Access rights

openAccess
acceptedVersion

A4 Article in conference proceedings

Language

en

Pages

10

Series

[Proceedings of the AES International Conference on Machine Learning and Artificial Intelligence for Audio], Journal of the Audio Engineering Society

Abstract

This paper presents a data-driven approach to automatic blind equalization of audio by predicting log-mel spectral features and deriving an inverse filter. The method uses a deep neural network, where a pre-trained model provides semantic embeddings as a backbone, and only a lightweight head is trained. This design is intended to enhance training efficiency and generalization. Trained on both music and speech, the model is robust to noise and reverberation. Objective evaluations confirm its effectiveness, and subjective tests show performance comparable to that of an oracle that uses true log-mel spectral features, indicating that the model accurately estimates the desired characteristics, with remaining limitations attributed to the filtering stage. Overall, the results highlight the potential of the method for real-world audio enhancement applications.
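To illustrate the step of deriving an equalization filter from log-mel spectral features, here is a minimal sketch. It is not the authors' implementation. It assumes the features are time-averaged log-mel band powers in dB, and it swaps in a simple frequency-sampling FIR design, since the record does not describe the filtering stage. The function name `design_eq_fir` and the parameters `sr`, `n_taps`, and `max_gain_db` are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel scale (an assumption; the paper's mel variant is not stated here)
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centers(n_mels, sr):
    # Center frequencies of n_mels bands equally spaced on the mel scale
    edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    return mel_to_hz(edges[1:-1])


def design_eq_fir(input_logmel_db, target_logmel_db, sr=44100,
                  n_taps=1025, max_gain_db=12.0):
    """Hypothetical helper: linear-phase FIR that moves the input's
    long-term log-mel spectrum toward a predicted target spectrum."""
    assert n_taps % 2 == 1, "odd length keeps the filter symmetric"
    # Per-band correction in dB, clipped to avoid extreme boosts or cuts
    gain_db = np.clip(np.asarray(target_logmel_db) - np.asarray(input_logmel_db),
                      -max_gain_db, max_gain_db)
    # Interpolate the band gains onto the linear frequency grid of the filter
    freqs = np.fft.rfftfreq(n_taps, d=1.0 / sr)
    gain_on_grid = np.interp(freqs, mel_centers(len(gain_db), sr), gain_db)
    magnitude = 10.0 ** (gain_on_grid / 20.0)
    # Zero-phase impulse response, shifted to the center and windowed
    h = np.fft.irfft(magnitude, n=n_taps)
    h = np.roll(h, n_taps // 2)
    return h * np.hanning(n_taps)
```

In this sketch the input spectrum would be measured from the degraded signal and the target spectrum would come from the network's prediction (or, for the oracle case, from the reference signal). The resulting taps can be applied with `np.convolve(signal, h)`.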

Citation

Moliner Juanpere, E, Välimäki, V, Drossos, K & Hämäläinen, M 2025, Automatic Audio Equalization with Semantic Embeddings. in [Proceedings of the AES International Conference on Machine Learning and Artificial Intelligence for Audio]. Journal of the Audio Engineering Society, Audio Engineering Society, AES International Conference on Machine Learning and Artificial Intelligence for Audio, London, United Kingdom, 08/09/2025. <https://aes2.org/publications/elibrary-page/?id=22996>