Automatic Speech Recognition for Northern Sámi with comparison to other Uralic Languages

dc.contributor: Aalto-yliopisto (fi)
dc.contributor: Aalto University (en)
dc.contributor.author: Smit, Peter (en_US)
dc.contributor.author: Leinonen, Juho (en_US)
dc.contributor.author: Jokinen, Kristiina (en_US)
dc.contributor.author: Kurimo, Mikko (en_US)
dc.contributor.department: Department of Signal Processing and Acoustics (en)
dc.contributor.groupauthor: Centre of Excellence in Computational Inference, COIN (en)
dc.contributor.groupauthor: Speech Recognition (en)
dc.contributor.organization: University of Helsinki (en_US)
dc.date.accessioned: 2017-01-19T10:40:46Z
dc.date.issued: 2016-01-20 (en_US)
dc.description.abstract: Speech technology applications for major languages are becoming widely available, but for many other languages there is no commercial interest in developing speech technology. As the lack of technology and applications will threaten the existence of these languages, it is important to study how to create speech recognizers with minimal effort and low resources. As a test case, we have developed a Large Vocabulary Continuous Speech Recognizer for Northern Sámi, a Finno-Ugric language that has few resources available for speech technology. Using only limited audio data (2.5 hours) and the Northern Sámi Wikipedia for the language model, we achieved 7.6% Letter Error Rate (LER). With a language model based on a higher-quality language corpus, we achieved 4.2% LER. To put this in perspective, we also trained systems in other, better-resourced Finno-Ugric languages (Finnish and Estonian) with the same amount of data and compared those to state-of-the-art systems in those languages. (en)
dc.description.version: Peer reviewed (en)
dc.format.extent: 11
dc.format.extent: 80-91
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Smit, P, Leinonen, J, Jokinen, K & Kurimo, M 2016, Automatic Speech Recognition for Northern Sámi with comparison to other Uralic Languages. in Proceedings of the Second International Workshop on Computational Linguistics for Uralic Languages, 9, University of Szeged, Szeged, Hungary, pp. 80-91, International Workshop on Computational Linguistics for the Uralic Languages, Szeged, Hungary, 20/01/2016. <http://rgai.inf.u-szeged.hu/project/iwclul/proceedings.pdf> (en)
dc.identifier.isbn: 978-963-306-504-4
dc.identifier.other: PURE UUID: 15232107-a79e-4fbd-ab26-e7824c2484af (en_US)
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/15232107-a79e-4fbd-ab26-e7824c2484af (en_US)
dc.identifier.other: PURE LINK: http://rgai.inf.u-szeged.hu/project/iwclul/proceedings.pdf (en_US)
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/10368745/iwclul2016smit.pdf (en_US)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/24164
dc.identifier.urn: URN:NBN:fi:aalto-201701191109
dc.language.iso: en (en)
dc.relation.ispartof: International Workshop on Computational Linguistics for the Uralic Languages (en)
dc.relation.ispartofseries: Proceedings of the Second International Workshop on Computational Linguistics for Uralic Languages (en)
dc.rights: openAccess (en)
dc.title: Automatic Speech Recognition for Northern Sámi with comparison to other Uralic Languages (en)
dc.type: A4 Artikkeli konferenssijulkaisussa [A4 Article in a conference publication] (fi)
dc.type.version: acceptedVersion
