Continuous Unsupervised Topic Adaptation for Morph-based Speech Recognition


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Mansikkaniemi, André
dc.date.accessioned 2017-02-09T10:00:34Z
dc.date.available 2017-02-09T10:00:34Z
dc.date.issued 2017
dc.identifier.isbn 978-952-60-7252-4 (electronic)
dc.identifier.isbn 978-952-60-7253-1 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.description.abstract Modern automatic speech recognition (ASR) systems are speaker-independent and designed to recognize continuous, large-vocabulary speech. The key components of an ASR system are the acoustic model, language model, lexicon, and decoder. A constant challenge for an ASR system is adapting over time to changing topics and to the introduction of new names and words. Enabling continuous topic adaptation requires finding new, relevant text sources for adapting the language model and identifying words that need new or modified pronunciation rules. This thesis studies unsupervised methods that enable continuous topic adaptation for a Finnish morph-based ASR system. Based on first-pass ASR output, topically and temporally relevant text data is retrieved from a collection of pre-indexed Web texts. Adapting the background language model with the best-matching texts improves recognition accuracy, including the recognition accuracy of foreign names and acronyms, one of the focus areas of this thesis. Further improvement is achieved by identifying foreign names and acronyms in the retrieved texts and generating adapted pronunciation rules for them. In statistical morph-based ASR, words are sometimes oversegmented. To make the mapping of adapted pronunciation rules easier and more reliable, oversegmented foreign names and acronyms are restored to their base forms. Morpheme restoration also improves recognition accuracy slightly. This thesis also explores user feedback as a means of ongoing lexicon adaptation for ASR systems. Based on user corrections of ASR output, optimal pronunciation rules for misrecognized words are recovered using forced alignment and Viterbi decoding. The collection of recovered pronunciation rules can then be used for recognizing new speech data. Experiments showed minor improvements in the recognition of foreign names with user-feedback-based lexicon adaptation. en
dc.format.extent 99 + app. 83
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 10/2017
dc.relation.haspart [Publication 1]: André Mansikkaniemi and Mikko Kurimo. Unsupervised Vocabulary Adaptation for Morph-based Language Models. In NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, pages 37–40, Montréal, Canada, June 2012.
dc.relation.haspart [Publication 2]: André Mansikkaniemi and Mikko Kurimo. Adaptation of Morph-based Speech Recognition for Foreign Entity Names. In Fifth International Conference Human Language Technologies - The Baltic Perspective, pages 129–137, Tartu, Estonia, October 2012. DOI: 10.3233/978-1-61499-133-5-129
dc.relation.haspart [Publication 3]: André Mansikkaniemi and Mikko Kurimo. Unsupervised Topic Adaptation for Morph-based Speech Recognition. In Interspeech 2013, pages 2693–2697, Lyon, France, September 2013.
dc.relation.haspart [Publication 4]: André Mansikkaniemi and Mikko Kurimo. Adaptation of Morph-Based Speech Recognition for Foreign Names and Acronyms. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pages 941–950, vol. 23, no. 5, May 2015. DOI: 10.1109/TASLP.2015.2414818
dc.relation.haspart [Publication 5]: André Mansikkaniemi and Mikko Kurimo. Unsupervised and User Feedback Based Lexicon Adaptation for Foreign Names and Acronyms. In Third International Conference on Statistical Language and Speech Processing, SLSP 2015, Volume 9449, pages 197–206, Budapest, Hungary, November 2015. DOI: 10.1007/978-3-319-25789-1_19
dc.relation.haspart [Publication 6]: Mikko Kurimo, Seppo Enarvi, Ottokar Tilk, Matti Varjokallio, André Mansikkaniemi, and Tanel Alumäe. Modeling Underresourced Languages for Speech Recognition. Language Resources and Evaluation, pages 1–27, February 2016. DOI: 10.1007/s10579-016-9336-9
dc.subject.other Computer science en
dc.subject.other Electrical engineering en
dc.subject.other Acoustics en
dc.title Continuous Unsupervised Topic Adaptation for Morph-based Speech Recognition en
dc.type G5 Artikkeliväitöskirja fi
dc.contributor.school Sähkötekniikan korkeakoulu fi
dc.contributor.school School of Electrical Engineering en
dc.contributor.department Signaalinkäsittelyn ja akustiikan laitos fi
dc.contributor.department Department of Signal Processing and Acoustics en
dc.subject.keyword morph-based speech recognition en
dc.subject.keyword retrieval-based language model adaptation en
dc.subject.keyword lexicon adaptation en
dc.subject.keyword user feedback based adaptation en
dc.subject.keyword foreign proper name detection en
dc.subject.keyword morph restoration en
dc.identifier.urn URN:ISBN:978-952-60-7252-4
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (article-based) en
dc.type.ontasot Väitöskirja (artikkeli) fi
dc.contributor.supervisor Kurimo, Mikko, Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.opn Svendsen, Torbjørn, Prof., Norwegian University of Science and Technology, Norway
dc.contributor.lab Speech Recognition Research Group en
dc.rev Lagus, Krista, Dr., University of Helsinki, Finland
dc.rev Mihajlik, Peter, Dr., Budapest University of Technology and Economics, Hungary
dc.date.defence 2017-02-17
