Analysis and classification of phonation types in speech and singing voice
No Thumbnail Available
Access rights
openAccess
URL
Journal Title
Journal ISSN
Volume Title
A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
2020-04
Major/Subject
Mcode
Degree programme
Language
en
Pages
15
33-47
33-47
Series
Speech Communication, Volume 118
Abstract
Both in speech and singing, humans are capable of generating sounds of different phonation types (e.g., breathy, modal and pressed). Previous studies in the analysis and classification of phonation types have mainly used voice source features derived using glottal inverse filtering (GIF). Even though glottal source features are useful in discriminating phonation types in speech, their performance deteriorates in singing voice due to the high fundamental frequency of these sounds that reduces the accuracy of source-filter separation in GIF. In the present study, features describing the glottal source were computed using three signal processing methods that do not compute source-filter separation. These three methods are zero frequency filtering (ZFF), zero time windowing (ZTW) and single frequency filtering (SFF). From each method, a group of scalar features were extracted. In addition, cepstral coefficients were derived from the spectra computed using ZTW and SFF. Experiments were conducted with the proposed features to analyse and classify phonation types using three phonation types (breathy, modal and pressed) for speech and singing voice. Statistical pair-wise comparisons between the phonation types showed that most of the features were capable of separating the phonation types significantly for speech and singing voices. Classification with support vector machine classifiers indicated that the proposed features and their combinations showed improved accuracy compared to usually employed glottal source features and mel-frequency cepstral coefficients (MFCCs).Description
Keywords
Phonation type, Voice quality, Singing voice, Glottal source, Glottal inverse filtering, Zero frequency filtering (ZFF), Zero time windowing (ZTW), Single frequency filtering (SFF)
Other note
Citation
Kadiri, S R, Alku, P & Yegnanarayana, B 2020, ' Analysis and classification of phonation types in speech and singing voice ', Speech Communication, vol. 118, pp. 33-47 . https://doi.org/10.1016/j.specom.2020.02.004