Mel-frequency cepstral coefficients of voice source waveforms for classification of phonation types in speech

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Kadiri, Sudarsana Reddy [en_US]
dc.contributor.author: Alku, Paavo [en_US]
dc.contributor.department: Department of Signal Processing and Acoustics [en]
dc.contributor.groupauthor: Speech Communication Technology [en]
dc.date.accessioned: 2020-01-02T14:10:00Z
dc.date.available: 2020-01-02T14:10:00Z
dc.date.issued: 2019-01-01 [en_US]
dc.description.abstract: Voice source characteristics in different phonation types vary due to the tension of laryngeal muscles along with the respiratory effort. This study investigates the use of mel-frequency cepstral coefficients (MFCCs) derived from voice source waveforms for classification of phonation types in speech. The cepstral coefficients are computed using two source waveforms: (1) glottal flow waveforms estimated by the quasi-closed phase (QCP) glottal inverse filtering method and (2) approximate voice source waveforms obtained using the zero frequency filtering (ZFF) method. QCP estimates voice source waveforms based on the source-filter decomposition, while ZFF yields source waveforms without explicitly computing the source-filter decomposition. Experiments using MFCCs computed from the two source waveforms show improved accuracy in classification of phonation types compared to the existing voice source features and conventional MFCC features. Further, it is observed that the proposed features carry information complementary to the existing features. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 5
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Kadiri, S R & Alku, P 2019, Mel-frequency cepstral coefficients of voice source waveforms for classification of phonation types in speech. in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. vol. 2019-September, Interspeech - Annual Conference of the International Speech Communication Association, INTERSPEECH, International Speech Communication Association (ISCA), pp. 2508-2512, Interspeech, Graz, Austria, 15/09/2019. https://doi.org/10.21437/Interspeech.2019-2863 [en]
dc.identifier.doi: 10.21437/Interspeech.2019-2863 [en_US]
dc.identifier.issn: 2308-457X
dc.identifier.other: PURE UUID: d512c3fb-cf23-4209-b010-9bcb51febe3a [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/d512c3fb-cf23-4209-b010-9bcb51febe3a [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85074692228&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/38768883/ELEC_Kadiri_Mel_frequency_cepstral_INTERSPEECH.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/42236
dc.identifier.urn: URN:NBN:fi:aalto-202001021347
dc.language.iso: en [en]
dc.relation.ispartof: Interspeech [en]
dc.relation.ispartofseries: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH [en]
dc.relation.ispartofseries: Volume 2019-September, pp. 2508-2512 [en]
dc.relation.ispartofseries: Interspeech - Annual Conference of the International Speech Communication Association, INTERSPEECH [en]
dc.rights: openAccess [en]
dc.subject.keyword: Glottal inverse filtering [en_US]
dc.subject.keyword: Phonation type [en_US]
dc.subject.keyword: Speech analysis [en_US]
dc.subject.keyword: Voice quality [en_US]
dc.subject.keyword: Voice source [en_US]
dc.subject.keyword: Zero frequency filtering [en_US]
dc.title: Mel-frequency cepstral coefficients of voice source waveforms for classification of phonation types in speech [en]
dc.type: A4 Artikkeli konferenssijulkaisussa [fi]
dc.type.version: publishedVersion

Files