Finnish number recognition system for a language learning application

School of Science | Master's thesis
Electronic archive copy is available via Aalto Thesis Database.

Language

en

Pages

50

Abstract

Mastery of number pronunciation is essential for functional fluency in a new language, yet dedicated computer-assisted tools for practicing this skill remain scarce, particularly for a morphologically complex, low-resource language like Finnish. To address this gap, this thesis develops an end-to-end speech recognition model that transcribes spoken Finnish numbers into numerical form. The model is built by fine-tuning a Wav2vec 2.0 model on a dataset of number utterances derived from the Finnish Donate Speech (Lahjoita puhetta) corpus, and it achieves a balanced accuracy of 0.78 on spoken number transcription. A detailed error analysis identifies mistranscription patterns and the specific numerical ranges where the model struggles, providing insights into its usability and possible future improvements. In practical application, the model is deployed as a server-side API for the SaySuomi Finnish learning application, where it repurposes CTC-segmentation to give users real-time feedback on number pronunciation. Overall, the study demonstrates the viability of applying fine-tuned speech recognition models to specialized pedagogical tasks and contributes a tangible tool to the landscape of Finnish computer-assisted language learning applications.

Supervisor

Kurimo, Mikko

Thesis advisor

Phan, Nhan
