Listening like a speech-training app: Expert and non-expert listeners’ goodness ratings of children’s speech
Access rights
openAccess
A1 Original article in a scientific journal
This publication is imported from Aalto University research portal.
Date
2025
Language
en
Pages
23
Series
Clinical Linguistics and Phonetics
Abstract
Speech training apps are being developed that provide automatic feedback concerning children’s production of known target words, as a score on a 1–5 scale. However, this ‘goodness’ scale is still poorly understood. We investigated listeners’ ratings of ‘how many stars the app should provide as feedback’ on children’s utterances, and whether listener agreement is affected by clinical experience and/or access to anchor stimuli. In addition, we explored the association between goodness ratings and two clinical measures of speech accuracy: the Percentage of Consonants Correct (PCC) and the Percentage of Phonemes Correct (PPC). Twenty speech-language pathologists and 20 non-expert listeners participated; half of the listeners in each group had access to anchor stimuli. The listeners rated 120 words, collected from children with and without speech sound disorder. Concerning reliability, intra-rater agreement was generally high, whereas inter-rater agreement was moderate. Access to anchor stimuli was associated with higher agreement, but only for non-expert listeners. Concerning the association between goodness ratings and the PCC/PPC, correlations were moderate for both listener groups, under both conditions. The results indicate that the task of rating goodness is difficult, regardless of clinical experience, and that access to anchor stimuli is insufficient for achieving reliable ratings. This raises concerns regarding the 1–5 rating scale as the means of feedback in speech training apps. More specific listener instructions, particularly regarding the intended context for the app, are suggested when collecting the human ratings that underlie the development of speech training apps. Until then, alternative means of feedback should be preferred.
Description
Publisher Copyright: © 2024 The Author(s). Published with license by Taylor & Francis Group, LLC.
Keywords
automatic assessment, perceptual assessment, speech accuracy, speech sound disorder
Citation
Strömbergsson, S, Fröjdh, M, Pettersson, M, Grósz, T, Getman, Y & Kurimo, M 2025, 'Listening like a speech-training app: Expert and non-expert listeners’ goodness ratings of children’s speech', Clinical Linguistics and Phonetics, vol. 39, no. 2, pp. 144-165. https://doi.org/10.1080/02699206.2024.2355470