What happens in continued pre-training? Analysis of self-supervised speech models with continued pre-training for colloquial Finnish ASR
Access rights
openAccess
publishedVersion
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Date
2024
Language
en
Pages
5
Series
Interspeech 2024, pp. 5043-5047, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Abstract
The advancement of self-supervised learning has enabled the rapid development of highly accurate speech recognition models, such as wav2vec 2.0, for many languages. While high-resourced languages like English benefit from purely monolingual models, other, less-resourced ones must build upon multilingual foundations. In this work, we investigate various strategies to specialize models for the colloquial Finnish language and demonstrate that continued pre-training of available multilingual models is the best solution. Furthermore, we investigate the success of the pre-training procedure by examining the learned quantized representations and show how the continued pre-training improved the discovered latent codeword groups.
Description
Publisher Copyright: © 2024 International Speech Communication Association. All rights reserved.
Keywords
ASR, continued pre-training, quantized representations, wav2vec2
Citation
Getman, Y, Grósz, T & Kurimo, M 2024, What happens in continued pre-training? Analysis of self-supervised speech models with continued pre-training for colloquial Finnish ASR. in Interspeech 2024. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, International Speech Communication Association (ISCA), pp. 5043-5047, Interspeech, Kos Island, Greece, 01/09/2024. https://doi.org/10.21437/Interspeech.2024-476