Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Access rights
openAccess
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Other link related to publication
Date
2022
Language
en
Pages
5
66-70
Series
Proceedings of Interspeech'22, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Abstract
It is common knowledge that the quantity and quality of the training data play a significant role in the creation of a good machine learning model. In this paper, we take it one step further and demonstrate that the way the training examples are arranged is also of crucial importance. Curriculum Learning is built on the observation that organized and structured assimilation of knowledge has the ability to enable faster training and better comprehension. When humans learn to speak, they first try to utter basic phones and then gradually move towards more complex structures such as words and sentences. This methodology is known as Curriculum Learning, and we employ it in the context of Automatic Speech Recognition. We hypothesize that end-to-end models can achieve better performance when provided with an organized training set consisting of examples that exhibit an increasing level of difficulty (i.e. a curriculum). To impose structure on the training set and to define the notion of an easy example, we explored multiple scoring functions that either use feedback from an external neural network or incorporate feedback from the model itself. Empirical results show that with different curriculums we can balance the training times and the network’s performance.
Description
The computational resources were provided by Aalto ScienceIT. We are grateful for the Academy of Finland project funding number 345790 in the ICT 2023 programme's project “Understanding speech and scene with ears and eyes”.
Keywords
Curriculum Learning, Automatic Speech Recognition, End-to-End
Citation
Karakasidis, G, Grósz, T & Kurimo, M 2022, Comparison and Analysis of New Curriculum Criteria for End-to-End ASR. in Proceedings of Interspeech'22. vol. 2022-September, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, International Speech Communication Association (ISCA), pp. 66-70, Interspeech, Incheon, Korea, Republic of, 18/09/2022. https://doi.org/10.21437/Interspeech.2022-10046