Character-based units for Unlimited Vocabulary Continuous Speech Recognition
Access rights
openAccess
acceptedVersion
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Language
en
Series
Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on, pp. 149-154
Abstract
We study character-based language models in a state-of-the-art speech recognition framework. This approach has advantages over both word-based systems and so-called end-to-end ASR systems, which do not have separate acoustic and language models. We describe the modifications needed to build an effective character-based ASR system with the Kaldi toolkit, and we evaluate models based on words, statistical morphs, and characters for both Finnish and Arabic. The morph-based models yield the best recognition results in both the well-resourced and the lower-resourced tasks, but the character-based models come close to their performance in the lower-resourced tasks while outperforming the word-based models. Character-based models are especially good at predicting novel word forms that were not seen in the training data. Character-based neural network language models are computationally efficient and provide a larger gain than in the morph-based and word-based systems.
Citation
Smit, P, Gangireddy, S, Enarvi, S, Virpioja, S & Kurimo, M 2018, Character-based units for Unlimited Vocabulary Continuous Speech Recognition. in Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on. IEEE, pp. 149-154, IEEE Automatic Speech Recognition and Understanding Workshop, Okinawa, Japan, 16/12/2017. https://doi.org/10.1109/ASRU.2017.8268929