Cognitive and probabilistic basis of prominence perception in speech

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Räsänen, Okko, Dr., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.contributor.author: Kakouros, Sofoklis
dc.contributor.department: Signaalinkäsittelyn ja akustiikan laitos [fi]
dc.contributor.department: Department of Signal Processing and Acoustics [en]
dc.contributor.school: Sähkötekniikan korkeakoulu [fi]
dc.contributor.school: School of Electrical Engineering [en]
dc.contributor.supervisor: Laine, Unto K., Prof. Emer., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.contributor.supervisor: Alku, Paavo, Acad. Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.date.accessioned: 2017-05-12T09:01:57Z
dc.date.available: 2017-05-12T09:01:57Z
dc.date.defence: 2017-05-26
dc.date.issued: 2017
dc.description.abstract: The research in this thesis examines the cognitive and probabilistic nature of prominence perception in speech. In recent years, a growing number of studies from linguistics, phonetics, and neuroscience have provided evidence that (i) prominence is related to attention- and expectation-based factors, (ii) frequency and predictability effects play an important role in language processing, accounting for several linguistic phenomena, and (iii) the human brain represents information probabilistically, with humans behaving as optimal probabilistic observers. On the basis of this evidence, the relationship between prominence, attention, and predictability is explored. The proposed hypothesis is that prominence perception in speech is connected with the unpredictability of prosodic features that draw the listeners' attention to the surprising aspects of the input. This thesis consists of a series of computational and behavioral studies that investigate different aspects of the prominence–attention–predictability triad. The core idea throughout this work is to investigate the probabilistic relations in the acoustic-prosodic domain through statistical modeling of the acoustic correlates of prominence, examining their relationship with the concurrent prominent/non-prominent units. As the probabilistic view of prominence also implies that listeners use some type of statistical learning mechanism operating at the suprasegmental acoustic-prosodic level, a number of behavioral experiments are also conducted. The aim of these experiments is to understand whether human listeners are sensitive to the statistical regularities of suprasegmental speech acoustics and, if so, to what extent. A basic application of statistical models to the automatic detection of prominence in speech is also reported.
As a result of these studies, the thesis shows that predictability at the acoustic-prosodic level is strongly correlated with human listeners' perception of prominence in speech. This statistical connection, however, is not fixed but depends on the listeners' experience with the language and thereby on their subjective expectations of prosodic outcomes. This is illustrated by results showing that the human perceptual system appears to adapt quickly to the suprasegmental probabilistic structure of the incoming speech, causing the prosodic patterns that are less frequent in the recent discourse-specific acoustics to be perceived as more prominent. Thus, the experiments indicate a type of statistical learning mechanism operating at the suprasegmental acoustic level. Finally, a practical application of the predictability framework to the unsupervised detection of prominence in speech is described. Experiments in several languages show that the method provides high agreement with human judgments of prominence despite not having access to prominence labeling during training of the detector. [en]
dc.format.extent: 91 + app. 142
dc.format.mimetype: application/pdf [en]
dc.identifier.isbn: 978-952-60-7423-8 (electronic)
dc.identifier.isbn: 978-952-60-7424-5 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/26213
dc.identifier.urn: URN:ISBN:978-952-60-7423-8
dc.language.iso: en [en]
dc.opn: Wagner, Petra, Prof. Dr., Bielefeld University, Germany
dc.publisher: Aalto University [en]
dc.publisher: Aalto-yliopisto [fi]
dc.relation.haspart: [Publication 1]: Sofoklis Kakouros, Okko Räsänen, and Unto K. Laine. Attention Based Temporal Filtering of Sensory Signals for Data Redundancy Reduction. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-2013), pp. 3188–3192, Vancouver, Canada, May 2013. DOI: 10.1109/ICASSP.2013.6638246
dc.relation.haspart: [Publication 2]: Sofoklis Kakouros and Okko Räsänen. Perception of Sentence Stress in English Infant Directed Speech. In 15th Annual Conference of the International Speech Communication Association (Interspeech-2014), pp. 1821–1825, Singapore, September 2014.
dc.relation.haspart: [Publication 3]: Sofoklis Kakouros and Okko Räsänen. Perception of Sentence Stress in Speech Correlates with the Temporal Unpredictability of Prosodic Features. Cognitive Science, 40(7), 1739–1774, September 2016. DOI: 10.1111/cogs.12306
dc.relation.haspart: [Publication 4]: Sofoklis Kakouros and Okko Räsänen. Analyzing the Predictability of Lexeme-specific Prosodic Features as a Cue to Sentence Prominence. In 37th Annual Conference of the Cognitive Science Society (CogSci-2015), pp. 1039–1044, Pasadena, California, July 2015.
dc.relation.haspart: [Publication 5]: Sofoklis Kakouros, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, and Okko Räsänen. Analyzing the Contribution of Top-down Lexical and Bottom-up Acoustic Cues in the Detection of Sentence Prominence. In 17th Annual Conference of the International Speech Communication Association (Interspeech-2016), pp. 1074–1078, San Francisco, California, September 2016. DOI: 10.21437/Interspeech.2016-926
dc.relation.haspart: [Publication 6]: Sofoklis Kakouros and Okko Räsänen. 3PRO – An Unsupervised Method for the Automatic Detection of Sentence Prominence in Speech. Speech Communication, 82, 67–84, September 2016. DOI: 10.1016/j.specom.2016.06.004
dc.relation.haspart: [Publication 7]: Sofoklis Kakouros, Nelli Salminen, and Okko Räsänen. Making Predictable Unpredictable with Style – Behavioral and Electrophysiological Evidence for the Critical Role of Prosodic Expectations in the Perception of Prominence in Speech. Submitted to Neuropsychologia.
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS [en]
dc.relation.ispartofseries: 88/2017
dc.rev: House, David, Prof., KTH Royal Institute of Technology, Sweden
dc.rev: Richmond, Korin, Dr., University of Edinburgh, UK
dc.subject.helecon: tekniikka
dc.subject.helecon: oppiminen
dc.subject.helecon: viestintä
dc.subject.keyword: prosody [en]
dc.subject.keyword: prominence [en]
dc.subject.keyword: attention [en]
dc.subject.keyword: speech perception [en]
dc.subject.keyword: statistical learning [en]
dc.subject.keyword: stimulus predictability [en]
dc.subject.keyword: speech analysis [en]
dc.subject.keyword: cognitive modeling [en]
dc.subject.other: Acoustics [en]
dc.subject.other: Psychology [en]
dc.subject.other: Linguistics [en]
dc.title: Cognitive and probabilistic basis of prominence perception in speech [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.dcmitype: text [en]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.archive: yes
local.aalto.formfolder: 2017_05_11_klo_14_43

Files

Original bundle

Name: isbn9789526074238.pdf
Size: 1.74 MB
Format: Adobe Portable Document Format