Excitation Features of Speech for Speaker-Specific Emotion Detection

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Kadiri, Sudarsana Reddy [en_US]
dc.contributor.author: Alku, Paavo [en_US]
dc.contributor.department: Department of Signal Processing and Acoustics [en]
dc.contributor.groupauthor: Speech Communication Technology [en]
dc.date.accessioned: 2020-04-28T06:50:29Z
dc.date.available: 2020-04-28T06:50:29Z
dc.date.issued: 2020-01-01 [en_US]
dc.description.abstract: In this article, we study emotion detection from speech in a speaker-specific scenario. By parameterizing the excitation component of voiced speech, the study explores deviations between emotional speech (e.g., speech produced in anger, happiness, sadness, etc.) and neutral speech (i.e., non-emotional) to develop an automatic emotion detection system. The excitation features used in this study are the instantaneous fundamental frequency, the strength of excitation and the energy of excitation. The Kullback-Leibler (KL) distance is computed to measure the similarity between feature distributions of emotional and neutral speech. Based on the KL distance value between a test utterance and an utterance produced in a neutral state by the same speaker, a detection decision is made by the system. In the training of the proposed system, only three neutral utterances produced by the speaker were used, unlike in most existing emotion recognition and detection systems that call for large amounts of training data (both emotional and neutral) by several speakers. In addition, the proposed system is independent of language or lexical content. The system is evaluated using two databases of emotional speech. The performance of the proposed detection method is shown to be better than that of reference methods. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 10
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Kadiri, S R & Alku, P 2020, 'Excitation Features of Speech for Speaker-Specific Emotion Detection', IEEE Access, vol. 8, 9046041, pp. 60382-60391. https://doi.org/10.1109/ACCESS.2020.2982954 [en]
dc.identifier.doi: 10.1109/ACCESS.2020.2982954 [en_US]
dc.identifier.issn: 2169-3536
dc.identifier.other: PURE UUID: 8c073f67-e83f-4278-a3a2-bcff37fbb4d4 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/8c073f67-e83f-4278-a3a2-bcff37fbb4d4 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85083447226&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/42557301/Kadiri_Excitation_features_of_speech_IEEEAccess.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/43889
dc.identifier.urn: URN:NBN:fi:aalto-202004282888
dc.language.iso: en [en]
dc.publisher: IEEE
dc.relation.ispartofseries: IEEE Access [en]
dc.relation.ispartofseries: Volume 8, pp. 60382-60391 [en]
dc.rights: openAccess [en]
dc.subject.keyword: emotion detection [en_US]
dc.subject.keyword: excitation source [en_US]
dc.subject.keyword: Kullback-Leibler (KL) distance [en_US]
dc.subject.keyword: linear prediction (LP) analysis [en_US]
dc.subject.keyword: paralinguistics [en_US]
dc.subject.keyword: Speech analysis [en_US]
dc.subject.keyword: zero frequency filtering (ZFF) [en_US]
dc.title: Excitation Features of Speech for Speaker-Specific Emotion Detection [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (A1 Original article in a scientific journal) [fi]
dc.type.version: publishedVersion
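
The abstract describes a detection decision based on the KL distance between the feature distribution of a test utterance and that of a neutral utterance from the same speaker. The following is a minimal Python sketch of that general idea only, not the authors' implementation. The smoothed histogram estimator, the symmetric form of the KL distance, the bin count, and the threshold are all assumptions made for illustration, and the function names are hypothetical; the record does not state how the article estimates the distributions or sets the decision rule.

```python
import math


def histogram(values, lo, hi, bins, eps=1e-6):
    """Normalized histogram over [lo, hi] with additive smoothing.

    The smoothing constant eps (an assumption) keeps every bin non-zero
    so that the logarithms in the KL distance stay finite.
    """
    counts = [0.0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1.0
    total = sum(counts) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) for two distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Symmetrized KL distance: D(p || q) + D(q || p) (assumed form)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def is_emotional(test_values, neutral_values, threshold, lo, hi, bins=20):
    """Flag the test utterance as emotional when its feature distribution
    lies farther than `threshold` (hypothetical) from the neutral reference."""
    p = histogram(test_values, lo, hi, bins)
    q = histogram(neutral_values, lo, hi, bins)
    return symmetric_kl(p, q) > threshold
```

In use, `neutral_values` would hold one excitation feature (for example, per-frame instantaneous fundamental frequency in Hz) pooled from the speaker's neutral utterances, and `test_values` the same feature from the utterance under test. The threshold would have to be tuned on held-out data.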

Files