Postfiltering Using Log-Magnitude Spectrum for Speech and Audio Coding

Conference article in proceedings
This publication is imported from the Aalto University research portal.
Date
2018-09
Language
en
Pages
3543-3547
Series
Interspeech
Abstract
Advanced coding algorithms yield high-quality signals with good coding efficiency within their target bit-rate ranges, but their performance suffers outside the target range. At lower bitrates, performance degrades because the decoded signals are sparse, which gives the signal a perceptually muffled and distorted character. Standard codecs reduce such distortions by applying noise filling and post-filtering methods. In this paper, we propose a post-processing method based on modeling the inherent time-frequency correlation in the log-magnitude spectrum. The goal is to improve the perceptual SNR of the decoded signals and to reduce the distortions caused by signal sparsity. Objective measures show an average improvement of 1.5 dB for input perceptual SNR in the range of 4 to 18 dB. The improvement is especially prominent in components that had been quantized to zero.
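The sparsity problem the abstract refers to can be illustrated with a toy example. The sketch below is not the method proposed in the paper: it only shows, under simplified assumptions, how a coarse quantizer zeroes weak spectral components and lowers the SNR. The uniform quantizer, the step sizes, the test frame, the log floor, and the plain (not perceptually weighted) SNR are all hypothetical choices made for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive DFT; returns the first N//2 + 1 complex bins of a real frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def quantize(values, step):
    """Uniform mid-tread quantizer: magnitudes below step/2 collapse to zero."""
    return [step * round(v / step) for v in values]


def log_magnitude(mags, floor=1e-6):
    """Log-magnitude in dB, with a floor so that zeroed bins stay finite."""
    return [20.0 * math.log10(max(m, floor)) for m in mags]


def snr_db(reference, estimate):
    """Plain SNR in dB (an assumption; the paper reports perceptual SNR)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)


# Toy frame: a strong tone in bin 4 plus a weak tone in bin 11.
N = 64
frame = [
    math.sin(2 * math.pi * 4 * t / N) + 0.05 * math.sin(2 * math.pi * 11 * t / N)
    for t in range(N)
]
mags = [abs(c) for c in dft(frame)]

coarse = quantize(mags, step=8.0)  # stands in for a low bitrate
fine = quantize(mags, step=0.5)    # stands in for a higher bitrate

zeros_coarse = sum(1 for m in coarse if m == 0)
zeros_fine = sum(1 for m in fine if m == 0)

print("zeroed bins (coarse, fine):", zeros_coarse, zeros_fine)
print("SNR dB (coarse, fine):", snr_db(mags, coarse), snr_db(mags, fine))
```

With the coarse step, the weak tone in bin 11 is quantized to zero, so it drops to the floor in the log-magnitude spectrum; a post-filter of the kind the abstract describes would aim to restore an estimate for such bins from their time-frequency neighbours.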
Citation
Das, S & Bäckström, T 2018, Postfiltering Using Log-Magnitude Spectrum for Speech and Audio Coding. In Interspeech: Annual Conference of the International Speech Communication Association, 1027, Interspeech, International Speech Communication Association (ISCA), pp. 3543-3547, Interspeech, Hyderabad, India, 02/09/2018. https://doi.org/10.21437/Interspeech.2018-1027