Acoustic Model Compression with MAP adaptation

Conference article in proceedings
This publication is imported from Aalto University research portal.
Date
2017
Language
en
Pages
5
65-69
Series
Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden, Linköping Electronic Conference Proceedings, Volume 131
Abstract
Speaker adaptation is an important step in optimizing and personalizing the performance of automatic speech recognition (ASR) for individual users. While many applications target rapid adaptation through various global transformations, slower adaptation that achieves a higher level of personalization would be useful for many active ASR users, especially those whose speech is not recognized well. This paper studies the outcome of combining maximum a posteriori (MAP) adaptation with compression of Gaussian mixture models. An important result that has received little previous attention is how MAP adaptation can be used to radically decrease the size of the models as they are tuned to a particular speaker. This is particularly relevant for small personal devices, which should provide accurate real-time recognition despite low memory, computation, and electricity consumption. With our method we are able to decrease the model complexity with MAP adaptation while increasing the accuracy.
Keywords
MAP adaptation, acoustic model adaptation, speech recognition, compression, acoustic model compression, speaker adaptation
Citation
Leino, K & Kurimo, M 2017, Acoustic Model Compression with MAP adaptation. in J Tiedemann (ed.), Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017, Gothenburg, Sweden. Linköping Electronic Conference Proceedings, vol. 131, Linköping University Electronic Press, pp. 65-69, Nordic Conference on Computational Linguistics, Gothenburg, Sweden, 22/05/2017.