wav2vec2-based Speech Rating System for Children with Speech Sound Disorder

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Getman, Yaroslav [en_US]
dc.contributor.author: Al-Ghezi, Ragheb [en_US]
dc.contributor.author: Voskoboinik, Ekaterina [en_US]
dc.contributor.author: Grósz, Tamás [en_US]
dc.contributor.author: Kurimo, Mikko [en_US]
dc.contributor.author: Salvi, Giampiero [en_US]
dc.contributor.author: Svendsen, Torbjørn [en_US]
dc.contributor.author: Strömbergsson, Sofia [en_US]
dc.contributor.department: Department of Signal Processing and Acoustics [en]
dc.contributor.groupauthor: Speech Recognition [en]
dc.contributor.organization: Department of Signal Processing and Acoustics [en_US]
dc.contributor.organization: Speech Recognition [en_US]
dc.contributor.organization: Norwegian University of Science and Technology [en_US]
dc.contributor.organization: Karolinska Institutet [en_US]
dc.date.accessioned: 2022-10-19T06:43:24Z
dc.date.available: 2022-10-19T06:43:24Z
dc.date.issued: 2022 [en_US]
dc.description: The computational resources were provided by Aalto ScienceIT. This work was supported by NordForsk through the funding to Technology-enhanced foreign and second-language learning of Nordic languages, project number 103893.
dc.description.abstract: Speaking is a fundamental way of communication, developed at a young age. Unfortunately, some children with speech sound disorder struggle to acquire this skill, hindering their ability to communicate efficiently. Speech therapies, which could aid these children in speech acquisition, greatly rely on speech practice trials and accurate feedback about their pronunciations. To enable home therapy and lessen the burden on speech-language pathologists, we need a highly accurate and automatic way of assessing the quality of speech uttered by young children. Our work focuses on exploring the applicability of state-of-the-art self-supervised, deep acoustic models, mainly wav2vec2, for this task. The empirical results highlight that these self-supervised models are superior to traditional approaches and close the gap between machine and human performance. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 5
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Getman, Y, Al-Ghezi, R, Voskoboinik, E, Grósz, T, Kurimo, M, Salvi, G, Svendsen, T & Strömbergsson, S 2022, wav2vec2-based Speech Rating System for Children with Speech Sound Disorder. in Proceedings of Interspeech'22. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, International Speech Communication Association (ISCA), pp. 3618-3622, Interspeech, Incheon, Korea, Republic of, 18/09/2022. https://doi.org/10.21437/Interspeech.2022-10103 [en]
dc.identifier.doi: 10.21437/Interspeech.2022-10103 [en_US]
dc.identifier.issn: 2958-1796
dc.identifier.other: PURE UUID: 4fe92000-e12c-41bb-afd3-20eaf52bbcf7 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/4fe92000-e12c-41bb-afd3-20eaf52bbcf7 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85140056600&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/89263566/Getman_et_alii_wav2vec2_based_Speech_Rating_System_for_Children_with_Speech_Sound_Disorder.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/117210
dc.identifier.urn: URN:NBN:fi:aalto-202210195998
dc.language.iso: en [en]
dc.relation.ispartof: Interspeech [en]
dc.relation.ispartofseries: Proceedings of Interspeech'22 [en]
dc.relation.ispartofseries: pp. 3618-3622 [en]
dc.relation.ispartofseries: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH [en]
dc.rights: openAccess [en]
dc.subject.keyword: speech assessment [en_US]
dc.subject.keyword: goodness of pronunciation [en_US]
dc.subject.keyword: children speech [en_US]
dc.subject.keyword: ASR [en_US]
dc.subject.keyword: wav2vec2 [en_US]
dc.title: wav2vec2-based Speech Rating System for Children with Speech Sound Disorder [en]
dc.type: A4 Artikkeli konferenssijulkaisussa [fi] (English: A4 Article in conference proceedings)
dc.type.version: publishedVersion