Finnish number recognition system for a language learning application

dc.contributor | Aalto-yliopisto | fi
dc.contributor | Aalto University | en
dc.contributor.advisor | Phan, Nhan
dc.contributor.author | Mai, Hoang
dc.contributor.school | Perustieteiden korkeakoulu | fi
dc.contributor.school | School of Science | en
dc.contributor.supervisor | Kurimo, Mikko
dc.date.accessioned | 2025-10-21T17:00:16Z
dc.date.available | 2025-10-21T17:00:16Z
dc.date.issued | 2025-09-06
dc.description.abstract | Mastery of number pronunciation is essential for functional fluency in a new language, yet dedicated computer-assisted tools for honing this crucial skill remain scarce, particularly for a morphologically complex and low-resource language like Finnish. To address this gap, this thesis develops an end-to-end speech recognition model tailored for transcribing spoken Finnish numbers into numerical form, achieved by fine-tuning a Wav2vec 2.0 model on a dataset of number utterances derived from the Finnish Donate Speech (Lahjoita puhetta) corpus. The resulting model achieves a balanced accuracy of 0.78 on spoken number transcription. Furthermore, a detailed error analysis was conducted to identify mistranscription patterns and locate specific numerical ranges where the model struggles, thereby providing insights into its usability and possible future improvements. In practical application, the model was deployed as a server-side API for the SaySuomi Finnish learning application, where it repurposes CTC-segmentation to offer users real-time feedback on number pronunciation. Overall, the study demonstrates the viability of applying fine-tuned speech recognition models to specialized pedagogical tasks and contributes a tangible tool to the landscape of Finnish computer-assisted language learning applications. | en
dc.format.extent | 50
dc.format.mimetype | application/pdf | en
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/140241
dc.identifier.urn | URN:NBN:fi:aalto-202510218409
dc.language.iso | en | en
dc.programme | Master's Programme in Computer, Communication and Information Sciences | en
dc.programme.major | Machine Learning, Data Science and Artificial Intelligence | en
dc.subject.keyword | speech recognition | en
dc.subject.keyword | pronunciation evaluation | en
dc.subject.keyword | Wav2vec 2.0 | en
dc.subject.keyword | computer-assisted pronunciation training | en
dc.subject.keyword | number transcription | en
dc.subject.keyword | end-to-end | en
dc.title | Finnish number recognition system for a language learning application | en
dc.type | G2 Pro gradu, diplomityö | fi
dc.type.ontasot | Master's thesis | en
dc.type.ontasot | Diplomityö | fi
local.aalto.electroniconly | yes
local.aalto.openaccess | no
