Finnish number recognition system for a language learning application
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.advisor | Phan, Nhan | |
| dc.contributor.author | Mai, Hoang | |
| dc.contributor.school | Perustieteiden korkeakoulu | fi |
| dc.contributor.school | School of Science | en |
| dc.contributor.supervisor | Kurimo, Mikko | |
| dc.date.accessioned | 2025-10-21T17:00:16Z | |
| dc.date.available | 2025-10-21T17:00:16Z | |
| dc.date.issued | 2025-09-06 | |
| dc.description.abstract | Mastery of number pronunciation is essential for functional fluency in a new language, yet dedicated computer-assisted tools for honing this crucial skill remain scarce, particularly for a morphologically complex and low-resource language like Finnish. To address this gap, this thesis develops an end-to-end speech recognition model tailored for transcribing spoken Finnish numbers into numerical form, achieved by fine-tuning a Wav2vec 2.0 model on a dataset of number utterances derived from the Finnish Donate Speech (Lahjoita puhetta) corpus. The resulting model achieves a balanced accuracy of 0.78 on spoken number transcription. Furthermore, a detailed error analysis was conducted to identify mistranscription patterns and locate specific numerical ranges where the model struggles, thereby providing insights into its usability and possible future improvements. In practical application, the model was deployed as a server-side API for the SaySuomi Finnish learning application, where it repurposes CTC segmentation to offer users real-time feedback on number pronunciation. Overall, the study demonstrates the viability of applying fine-tuned speech recognition models to specialized pedagogical tasks and contributes a tangible tool to the landscape of Finnish computer-assisted language learning applications. | en |
| dc.format.extent | 50 | |
| dc.format.mimetype | application/pdf | en |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/140241 | |
| dc.identifier.urn | URN:NBN:fi:aalto-202510218409 | |
| dc.language.iso | en | en |
| dc.programme | Master's Programme in Computer, Communication and Information Sciences | en |
| dc.programme.major | Machine Learning, Data Science and Artificial Intelligence | en |
| dc.subject.keyword | speech recognition | en |
| dc.subject.keyword | pronunciation evaluation | en |
| dc.subject.keyword | Wav2vec 2.0 | en |
| dc.subject.keyword | computer-assisted pronunciation training | en |
| dc.subject.keyword | number transcription | en |
| dc.subject.keyword | end-to-end | en |
| dc.title | Finnish number recognition system for a language learning application | en |
| dc.type | G2 Pro gradu, diplomityö | fi |
| dc.type.ontasot | Master's thesis | en |
| dc.type.ontasot | Diplomityö | fi |
| local.aalto.electroniconly | yes | |
| local.aalto.openaccess | no | |