Automatic Construction of the Finnish Parliament Speech Corpus

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Mansikkaniemi, Andre
dc.contributor.author Smit, Peter
dc.contributor.author Kurimo, Mikko
dc.date.accessioned 2017-10-15T20:54:41Z
dc.date.available 2017-10-15T20:54:41Z
dc.date.issued 2017-08
dc.identifier.citation Mansikkaniemi, A., Smit, P. & Kurimo, M. 2017, Automatic Construction of the Finnish Parliament Speech Corpus. In Interspeech 2017, pp. 3762-3766. DOI: 10.21437/Interspeech.2017-1115 en
dc.identifier.other PURE UUID: b1ffce87-0beb-4c3c-89e5-a0adfdbaf1f6
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/automatic-construction-of-the-finnish-parliament-speech-corpus(b1ffce87-0beb-4c3c-89e5-a0adfdbaf1f6).html
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/15742470/mansikkamaki_interspeech1115.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/28277
dc.description.abstract Automatic speech recognition (ASR) systems require large amounts of transcribed speech data for training state-of-the-art deep neural network (DNN) acoustic models. Transcribed speech is a scarce and expensive resource, and ASR systems are prone to underperform in domains where little training data is available. In this work, we open up a vast and previously unused resource of transcribed speech for Finnish by retrieving and aligning all the recordings and meeting transcripts from the web portal of the Parliament of Finland. Short speech-text segment pairs are retrieved from the audio and text material by using the Levenshtein algorithm to align the first-pass ASR hypotheses with the corresponding meeting transcripts. DNN acoustic models are trained on the automatically constructed corpus, and their performance is compared to that of other models trained on a commercially available speech corpus. Model performance is evaluated on Finnish parliament speech by dividing the test set into seen and unseen speakers. Performance is also evaluated on broadcast speech to test the general applicability of the parliament speech corpus. We also study the use of meeting transcripts in language model adaptation to achieve additional gains in speech recognition accuracy on Finnish parliament speech. en
dc.format.extent 3762-3766
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation.ispartofseries Interspeech 2017 en
dc.rights openAccess en
dc.subject.other 113 Computer and information sciences en
dc.subject.other 213 Electronic, automation and communications engineering, electronics en
dc.title Automatic Construction of the Finnish Parliament Speech Corpus en
dc.type A4 Artikkeli konferenssijulkaisussa (Article in conference proceedings) fi
dc.description.version Peer reviewed en
dc.contributor.department Department of Signal Processing and Acoustics
dc.subject.keyword automatic speech recognition
dc.subject.keyword speech-to-text alignment
dc.subject.keyword DNN acoustic models
dc.subject.keyword parliament speech data
dc.subject.keyword transcribed speech corpus
dc.subject.keyword 113 Computer and information sciences
dc.subject.keyword 213 Electronic, automation and communications engineering, electronics
dc.identifier.urn URN:NBN:fi:aalto-201710157137
dc.identifier.doi 10.21437/Interspeech.2017-1115
dc.type.version acceptedVersion
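
The abstract describes aligning first-pass ASR hypotheses with meeting transcripts using the Levenshtein algorithm. The following is a minimal sketch of word-level Levenshtein alignment, not the authors' implementation: the function name `align_words`, the uniform edit costs, and the example sentences are assumptions made for illustration. It returns the edit distance and the index pairs of words that match exactly, which is the kind of information one could use to pick out reliably aligned speech-text segments.

```python
def align_words(hyp, ref):
    """Word-level Levenshtein alignment (illustrative sketch, hypothetical helper).

    hyp: list of words from an ASR hypothesis
    ref: list of words from the corresponding transcript
    Returns (distance, matches), where matches is a list of
    (hyp_index, ref_index) pairs for words that align and are identical.
    """
    n, m = len(hyp), len(ref)

    # dist[i][j] = edit distance between hyp[:i] and ref[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all hypothesis words
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all transcript words

    # Assumption: insertions, deletions and substitutions all cost 1.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # hypothesis word has no counterpart
                dist[i][j - 1] + 1,         # transcript word has no counterpart
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Trace back from the bottom-right corner to recover the alignment.
    matches = []
    i, j = n, m
    while i > 0 and j > 0:
        cost = 0 if hyp[i - 1] == ref[j - 1] else 1
        if dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost == 0:
                matches.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return dist[n][m], matches


if __name__ == "__main__":
    # Made-up example: the hypothesis misses two words present in the transcript.
    hypothesis = "arvoisa puhemies hallitus on antanut esityksen".split()
    transcript = "arvoisa herra puhemies hallitus on antanut uuden esityksen".split()
    distance, matches = align_words(hypothesis, transcript)
    print(distance)
    print(matches)
```

In a corpus-building setting, runs of consecutive matched words could be treated as anchors for cutting audio and text into short segment pairs; how the paper actually selects segments is not stated in this record.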

