Self-Supervised Learning for Colloquial Finnish
School of Science (Perustieteiden korkeakoulu)
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Date
2024-01-22
Major/Subject
Machine Learning, Data Science and Artificial Intelligence
Mcode
SCI3044
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
53
Abstract
Self-supervised learning has shown strong results for automatic speech recognition (ASR) because it can exploit untranscribed speech in training, learning speech representations directly from raw audio. The aim of this work was to create a self-supervised speech model for colloquial Finnish and to compare pretraining from scratch with continued pretraining. To this end, three models following the base Wav2Vec 2.0 architecture were pretrained: one from scratch and two by continued pretraining, starting from a monolingual Finnish model and from a multilingual model, respectively. The model pretrained from scratch and the model continued from the monolingual Finnish model performed similarly, and both yielded promising results. The model continued from the monolingual Finnish model slightly outperformed the model pretrained from scratch, obtaining a word error rate of 28.1% and a character error rate of 7.5%.
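For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: the Levenshtein edit distance between the reference transcript and the recognizer output, divided by the reference length. The function names, the whitespace tokenization, the choice to count spaces as characters, and the example sentences are assumptions made for illustration; the record does not state which scoring tool or text normalization the thesis used.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    # prev[j] holds the distance between the ref prefix processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis is i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER: word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()  # assumption: plain whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER: character-level edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    # assumption: spaces are counted as characters; some toolkits strip them first
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: a colloquial reference and a more standard-language hypothesis.
reference = "mä oon menossa kotiin"
hypothesis = "mä olen menossa kotiin"
wer = word_error_rate(reference, hypothesis)
cer = character_error_rate(reference, hypothesis)
```

In this illustrative pair, one of the four reference words is wrong, while only a couple of characters differ. This is why CER is typically much lower than WER for the same output, as with the 7.5% and 28.1% figures reported above.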
Supervisor
Kurimo, Mikko
Thesis advisor
Getman, Yaroslav
Keywords
machine learning, automatic speech recognition, self-supervised learning, Wav2Vec 2.0