Using stacked transformations for recognizing foreign accented speech

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Smit, Peter
dc.contributor.author Kurimo, Mikko
dc.date.accessioned 2013-12-13T10:00:23Z
dc.date.available 2013-12-13T10:00:23Z
dc.date.issued 2011
dc.identifier.citation Smit, Peter & Kurimo, Mikko. 2011. Using stacked transformations for recognizing foreign accented speech. 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). ISSN 1520-6149 (printed). ISBN 978-1-4577-0537-3 (electronic). ISBN 978-1-4577-0538-0 (printed). DOI: 10.1109/ICASSP.2011.5947481. en
dc.identifier.isbn 978-1-4577-0537-3 (electronic)
dc.identifier.isbn 978-1-4577-0538-0 (printed)
dc.identifier.issn 1520-6149 (printed)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/11838
dc.description.abstract A common problem in speech recognition for foreign accented speech is that there is not enough training data for an accent-specific or a speaker-specific recognizer. Speaker adaptation can be used to improve the accuracy of a speaker independent recognizer, but a lot of adaptation data is needed for speakers with a strong foreign accent. In this paper we propose a rather simple and successful technique of stacked transformations where the baseline models trained for native speakers are first adapted by using accent-specific data and then by another transformation using speaker-specific data. Because the accent-specific data can be collected offline, the first transformation can be more detailed and comprehensive, and the second one less detailed and fast. Experimental results are provided for speaker adaptation in English spoken by Finnish speakers. The evaluation results confirm that the stacked transformations are very helpful for fast speaker adaptation. en
dc.format.extent 4
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher IEEE en
dc.relation.ispartof 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) en
dc.subject.other Computer science en
dc.title Using stacked transformations for recognizing foreign accented speech en
dc.type A4 Article in conference proceedings en
dc.description.version Peer reviewed
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietojenkäsittelytieteen laitos fi
dc.contributor.department Department of Information and Computer Science en
dc.subject.keyword automatic speech recognition en
dc.subject.keyword foreign-accent recognition en
dc.subject.keyword cmllr transformation en
dc.subject.keyword stacked transformations en
dc.identifier.urn URN:ISBN:978-1-4577-0537-3
dc.type.dcmitype text en
dc.identifier.doi 10.1109/ICASSP.2011.5947481
dc.type.version Post print
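
The abstract describes adapting a baseline model with an accent-specific transformation followed by a speaker-specific one. The sketch below illustrates only the stacking idea, under the assumption that each transformation is an affine feature-space map x -> A x + b, which is the form a CMLLR transformation takes. The function names (`apply_affine`, `stack`) and the 2-dimensional matrices are hypothetical values chosen for illustration; they are not taken from the paper, and in practice each transformation would be estimated from adaptation data.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Affine = Tuple[Matrix, Vector]  # (A, b), representing the map x -> A x + b


def apply_affine(transform: Affine, x: Vector) -> Vector:
    """Apply x -> A x + b to a single feature vector."""
    A, b = transform
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


def stack(first: Affine, second: Affine) -> Affine:
    """Compose two affine maps: apply `first`, then `second`.

    second(first(x)) = A2 (A1 x + b1) + b2 = (A2 A1) x + (A2 b1 + b2)
    """
    A1, b1 = first
    A2, b2 = second
    inner = len(A1)       # rows of A1 == columns of A2
    cols = len(A1[0])
    A = [[sum(A2[i][k] * A1[k][j] for k in range(inner)) for j in range(cols)]
         for i in range(len(A2))]
    b = [sum(A2[i][k] * b1[k] for k in range(inner)) + b2[i]
         for i in range(len(A2))]
    return A, b


# Hypothetical transforms (illustrative numbers, not estimated from data).
accent_transform: Affine = ([[2.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
speaker_transform: Affine = ([[1.0, 1.0], [0.0, 1.0]], [0.0, -1.0])

# Accent-specific transform first, speaker-specific transform second.
stacked = stack(accent_transform, speaker_transform)

features = [1.0, 2.0]
adapted = apply_affine(stacked, features)
```

Applying `stacked` to a feature vector gives the same result as applying `accent_transform` and then `speaker_transform` in sequence, which is the sense in which the two transformations are stacked here.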

