Aalto system for the 2017 Arabic multi-genre broadcast challenge

Conference article in proceedings
This publication is imported from Aalto University research portal.
Date
2018
Language
en
Pages
338-345
Series
Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on
Abstract
We describe the speech recognition systems we built for MGB-3, the 3rd Multi-Genre Broadcast challenge. This year's task was to build a system for transcribing Egyptian Dialect Arabic speech using a large audio corpus of primarily Modern Standard Arabic speech and only a small amount (5 hours) of Egyptian adaptation data. Our system, a combination of different acoustic models, language models and lexical units, achieved a Multi-Reference Word Error Rate of 29.25%, the lowest in the competition. On the earlier MGB-2 task, which was run again to indicate progress, we also achieved the lowest error rate: 13.2%. These results come from combining state-of-the-art speech recognition methods: simple dialect adaptation for a Time-Delay Neural Network (TDNN) acoustic model (-27% errors compared to the baseline), Recurrent Neural Network Language Model (RNNLM) rescoring (an additional -5%), and system combination with Minimum Bayes Risk (MBR) decoding (yet another -10%). We also explored morph and character language models, which were particularly beneficial in providing a rich pool of systems for the MBR decoding.
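As a reading aid for the error-rate figures in the abstract, the following minimal Python sketch shows how a plain word error rate can be computed and how successive reductions could combine. It rests on two assumptions that the abstract does not confirm: that the quoted -27%, -5% and -10% are relative reductions, each applied to the result of the previous stage, and that a standard single-reference WER is an acceptable stand-in for illustration. The challenge's Multi-Reference Word Error Rate is defined by the challenge itself and is not reproduced here. The function names `word_error_rate` and `compound_relative_changes` are hypothetical.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Plain single-reference WER: word-level edit distance / reference length.

    Illustrative only; this is not the challenge's Multi-Reference WER.
    """
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)


def compound_relative_changes(changes: Sequence[float]) -> float:
    """Combine successive relative changes (e.g. -0.27) into one overall change.

    Assumes each change is relative to the output of the previous stage.
    """
    remaining = 1.0
    for change in changes:
        remaining *= 1.0 + change
    return remaining - 1.0


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d".split(), "a x c".split()))
    # Adaptation, RNNLM rescoring and MBR combination under the stated assumption.
    print(compound_relative_changes([-0.27, -0.05, -0.10]))
```

Under the compounding assumption, the three steps together would correspond to roughly a 38% relative reduction in errors compared to the baseline. If the percentages were instead all measured against the baseline, the combined figure would differ.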
Citation
Smit, P, Gangireddy, S, Enarvi, S, Virpioja, S & Kurimo, M 2018, Aalto system for the 2017 Arabic multi-genre broadcast challenge. in Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on. IEEE, pp. 338-345, IEEE Automatic Speech Recognition and Understanding Workshop, Okinawa, Japan, 16/12/2017. https://doi.org/10.1109/ASRU.2017.8268955