End-to-End Optimized Multi-Stage Vector Quantization of Spectral Envelopes for Speech and Audio Coding

Access rights

openAccess
publishedVersion

A4 Article in conference proceedings

Date

2021-09

Language

en

Pages

5

Series

22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, pp. 2728-2732, Annual Conference of the International Speech Communication Association

Abstract

Spectral envelope modeling is an instrumental part of speech and audio codecs and can be used to enable efficient entropy coding of spectral components. Overall optimization of codecs, including envelope models, has, however, been difficult due to the complicated interactions between the different modules of a codec. In this paper, we study an end-to-end optimization methodology that optimizes all modules of a codec jointly, capturing these complex interactions with a global loss function. To quantize the spectral envelope parameters at a fixed bitrate, we use multistage vector quantization, which gives high quality yet has a computational complexity that is realistic for embedded devices. The results demonstrate benefits in terms of PESQ and PSNR in comparison to the 3GPP EVS and our recently proposed PyAWNeS codecs.
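
To illustrate the multistage vector quantization mentioned in the abstract, the following is a minimal sketch of a generic greedy multi-stage quantizer, in which each stage quantizes the residual left by the previous stages. It is not the authors' implementation: the function names, the random codebooks, the number of stages and the bits per stage are all assumptions made for illustration, and the end-to-end training of the codebooks described in the paper is not shown.

```python
import random


def nearest(codebook, vector):
    """Return the index of the codeword closest to `vector` (squared Euclidean distance)."""
    def dist(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def msvq_encode(codebooks, vector):
    """Greedy multi-stage VQ: each stage quantizes the residual of the earlier stages."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen codeword; the next stage sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def msvq_decode(codebooks, indices):
    """Reconstruct the vector as the sum of the selected codewords of all stages."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def make_codebooks(num_stages, bits_per_stage, dim, seed=0):
    """Hypothetical random codebooks with shrinking scale; real ones would be trained."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.5 ** s) for _ in range(dim)]
         for _ in range(2 ** bits_per_stage)]
        for s in range(num_stages)
    ]


# Assumed toy setup: 3 stages of 4 bits each, i.e. a fixed 12 bits per envelope vector.
codebooks = make_codebooks(num_stages=3, bits_per_stage=4, dim=8)
rng = random.Random(1)
envelope = [rng.gauss(0.0, 1.0) for _ in range(8)]  # stand-in for envelope parameters

indices = msvq_encode(codebooks, envelope)
reconstruction = msvq_decode(codebooks, indices)
```

In this toy setup the bitrate is fixed at 3 × 4 = 12 bits per vector, and the encoder compares against 3 × 16 = 48 codewords, whereas a single-stage codebook with the same 12 bits would hold 4096 codewords. This difference in search and storage cost is the usual motivation for multi-stage designs on constrained hardware.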

Citation

Vali, M & Bäckström, T 2021, End-to-End Optimized Multi-Stage Vector Quantization of Spectral Envelopes for Speech and Audio Coding. in 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021. Annual Conference of the International Speech Communication Association, International Speech Communication Association (ISCA), pp. 2728-2732, Interspeech, Brno, Czech Republic, 30/08/2021. https://doi.org/10.21437/Interspeech.2021-867