Time-Frequency Audio Similarity Using Optimal Transport
Access rights
openAccess
acceptedVersion
A4 Artikkeli konferenssijulkaisussa
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Date
2025-04-04
Language
en
Pages
4
Series
2024 58th Asilomar Conference on Signals, Systems, and Computers, pp. 1414-1417, Asilomar Conference on Signals, Systems, and Computers
Abstract
In audio signal processing, having an effective metric for comparing audio data is essential to ensure an accurate understanding of sound properties and attributes. In this work, we formulate two novel approaches for measuring the similarity between audio signals in the time-frequency domain, taking advantage of principles from classical optimal transport problems and sliced Wasserstein distances. Using optimal transport to construct the metric allows for a more robust signal content comparison, considering not only the signals' individual elements but also the global distribution in the signal space. Additionally, the sliced Wasserstein methods expand the use of the distances to high dimensional problems. By integrating both time and frequency aspects into our metrics, we aim for a more comprehensive comparison that can better handle various types of signal distortions. Results show promising behavior in accurately measuring distances for increasing signal differences and avoiding the presence of local minima in the loss curves.
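To make the general idea of a sliced Wasserstein comparison in the time-frequency plane concrete, the following is a minimal sketch and not the authors' formulation, which this record does not spell out. It assumes both spectrograms are non-negative arrays of the same shape (frequency bins by time frames), normalizes each to unit mass, projects the shared time-frequency grid onto random directions, and averages the resulting one-dimensional Wasserstein-1 distances. The function names, the number of slices, and the choice of the Wasserstein-1 cost are all illustrative assumptions.

```python
import numpy as np


def normalize(spec):
    """Scale a non-negative spectrogram so that it sums to one."""
    spec = np.asarray(spec, dtype=float)
    total = spec.sum()
    if total <= 0:
        raise ValueError("spectrogram must have positive total energy")
    return spec / total


def wasserstein1_1d(positions, mass_a, mass_b):
    """W1 between two mass vectors placed on the same 1-D positions.

    Uses the closed form: integral of |CDF_a - CDF_b| along the line.
    """
    order = np.argsort(positions)
    x = positions[order]
    cdf_gap = np.cumsum(mass_a[order] - mass_b[order])
    return float(np.sum(np.abs(cdf_gap[:-1]) * np.diff(x)))


def sliced_w1(spec_a, spec_b, n_slices=64, seed=0):
    """Average 1-D W1 over random projections of the time-frequency grid.

    Hypothetical helper for illustration only; grid units are bin indices.
    """
    if np.shape(spec_a) != np.shape(spec_b):
        raise ValueError("spectrograms must share the same shape")
    a = normalize(spec_a).ravel()
    b = normalize(spec_b).ravel()
    n_freq, n_time = np.shape(spec_a)
    # Coordinates of every bin, in the same (row-major) order as ravel().
    f, t = np.meshgrid(np.arange(n_freq), np.arange(n_time), indexing="ij")
    grid = np.stack([t.ravel(), f.ravel()], axis=1).astype(float)
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, np.pi, size=n_slices)
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    dists = [wasserstein1_1d(grid @ d, a, b) for d in directions]
    return float(np.mean(dists))
```

Under these assumptions, calling `sliced_w1(spec_a, spec_b)` on two spectrograms whose energy differs only by a time shift gives a value that grows with the size of the shift, which is the kind of behavior the abstract describes for increasing signal differences.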
Keywords
Costs, Distortion, Distortion measurement, Loss measurement, MIMICs, Machine listening, Market research, Optimization, Spectrogram, Time-frequency analysis, similarity measure, optimal transport, optimization, audio-to-audio distance, Wasserstein distance
Citation
Fabiani, L, Schlecht, S J & Elvander, F 2025, Time-Frequency Audio Similarity Using Optimal Transport. in M B Matthews (ed.), 2024 58th Asilomar Conference on Signals, Systems, and Computers., 10943074, Asilomar Conference on Signals, Systems, and Computers, IEEE, pp. 1414-1417, Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, United States, 27/10/2024. https://doi.org/10.1109/IEEECONF60004.2024.10943074