Cognitive and probabilistic basis of prominence perception in speech
dc.contributor | Aalto-yliopisto | fi |
dc.contributor | Aalto University | en |
dc.contributor.advisor | Räsänen, Okko, Dr., Aalto University, Department of Signal Processing and Acoustics, Finland | |
dc.contributor.author | Kakouros, Sofoklis | |
dc.contributor.department | Signaalinkäsittelyn ja akustiikan laitos | fi |
dc.contributor.department | Department of Signal Processing and Acoustics | en |
dc.contributor.school | Sähkötekniikan korkeakoulu | fi |
dc.contributor.school | School of Electrical Engineering | en |
dc.contributor.supervisor | Laine, Unto K., Prof. Emer., Aalto University, Department of Signal Processing and Acoustics, Finland | |
dc.contributor.supervisor | Alku, Paavo, Acad. Prof., Aalto University, Department of Signal Processing and Acoustics, Finland | |
dc.date.accessioned | 2017-05-12T09:01:57Z | |
dc.date.available | 2017-05-12T09:01:57Z | |
dc.date.defence | 2017-05-26 | |
dc.date.issued | 2017 | |
dc.description.abstract | The research in this thesis examines the topic of the cognitive and probabilistic nature of prominence perception in speech. In recent years, there has been an accumulating number of studies from linguistics, phonetics, and neuroscience providing evidence that (i) prominence is related to attention- and expectation-based factors, (ii) frequency and predictability effects hold an important role in language processing, accounting for several linguistic phenomena, and (iii) the human brain represents information in a probabilistic way, with humans behaving as optimal probabilistic observers. On the basis of this evidence, the relationship between prominence, attention, and predictability is explored. A hypothesis is proposed suggesting that prominence perception in speech is connected with the unpredictability of prosodic features that draw the listeners' attention to the surprising aspects of the input. This thesis consists of a series of computational and behavioral studies that investigate different aspects of the prominence–attention–predictability tripartite. The core idea throughout this work is to investigate the probabilistic relations that take place at the acoustic prosodic domain through statistical modeling of the acoustic correlates of prominence, examining their relationship with the concurrent prominent/non-prominent units. As the probabilistic view of prominence also implies that listeners utilize some type of statistical learning mechanism operating at the suprasegmental acoustic prosodic level, a number of behavioral experiments are also conducted. The aim of these experiments is to understand whether human listeners are sensitive to the statistical regularities of suprasegmental speech acoustics and, if so, to what extent. A basic application of statistical models for the automatic detection of prominence in speech is also reported. 
As a result of these studies, the thesis shows that predictability at the acoustic prosodic level is strongly correlated with human listeners' perception of prominence in speech. This statistical connection, however, is not fixed but depends on the listeners' experience with the language and thereby with subjective expectations of prosodic outcomes. This is illuminated by results that show that the human perceptual system appears to quickly adapt to the suprasegmental probabilistic structure of the incoming speech, causing the prosodic patterns that are less frequent in the recent discourse-specific acoustics to be more prominent. Thus, the experiments indicate a type of statistical learning mechanism operating at the suprasegmental acoustic level. Finally, a practical application of the predictability framework to the unsupervised detection of prominence in speech is described. Experiments in several languages show that the method provides high agreement with human judgments of prominence despite not having access to prominence labeling during training of the detector. | en |
dc.format.extent | 91 + app. 142 | |
dc.format.mimetype | application/pdf | en |
dc.identifier.isbn | 978-952-60-7423-8 (electronic) | |
dc.identifier.isbn | 978-952-60-7424-5 (printed) | |
dc.identifier.issn | 1799-4942 (electronic) | |
dc.identifier.issn | 1799-4934 (printed) | |
dc.identifier.issn | 1799-4934 (ISSN-L) | |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/26213 | |
dc.identifier.urn | URN:ISBN:978-952-60-7423-8 | |
dc.language.iso | en | en |
dc.opn | Wagner, Petra, Prof. Dr., Bielefeld University, Germany | |
dc.publisher | Aalto University | en |
dc.publisher | Aalto-yliopisto | fi |
dc.relation.haspart | [Publication 1]: Sofoklis Kakouros, Okko Räsänen, and Unto K. Laine. Attention Based Temporal Filtering of Sensory Signals for Data Redundancy Reduction. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-2013), pp. 3188–3192, Vancouver, Canada, May 2013. DOI: 10.1109/ICASSP.2013.6638246 | |
dc.relation.haspart | [Publication 2]: Sofoklis Kakouros and Okko Räsänen. Perception of Sentence Stress in English Infant Directed Speech. In 15th Annual Conference of the International Speech Communication Association (Interspeech-2014), pp. 1821–1825, Singapore, September 2014. | |
dc.relation.haspart | [Publication 3]: Sofoklis Kakouros and Okko Räsänen. Perception of Sentence Stress in Speech Correlates with the Temporal Unpredictability of Prosodic Features. Cognitive Science, 40(7), 1739–1774, September 2016. DOI: 10.1111/cogs.12306 | |
dc.relation.haspart | [Publication 4]: Sofoklis Kakouros and Okko Räsänen. Analyzing the Predictability of Lexeme-specific Prosodic Features as a Cue to Sentence Prominence. In 37th Annual Conference of the Cognitive Science Society (CogSci-2015), pp. 1039–1044, Pasadena, California, July 2015. | |
dc.relation.haspart | [Publication 5]: Sofoklis Kakouros, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, and Okko Räsänen. Analyzing the Contribution of Top-down Lexical and Bottom-up Acoustic Cues in the Detection of Sentence Prominence. In 17th Annual Conference of the International Speech Communication Association (Interspeech-2016), pp. 1074–1078, San Francisco, California, September 2016. DOI: 10.21437/Interspeech.2016-926 | |
dc.relation.haspart | [Publication 6]: Sofoklis Kakouros and Okko Räsänen. 3PRO – An Unsupervised Method for the Automatic Detection of Sentence Prominence in Speech. Speech Communication, 82, 67–84, September 2016. DOI: 10.1016/j.specom.2016.06.004 | |
dc.relation.haspart | [Publication 7]: Sofoklis Kakouros, Nelli Salminen, and Okko Räsänen. Making Predictable Unpredictable with Style – Behavioral and Electrophysiological Evidence for the Critical Role of Prosodic Expectations in the Perception of Prominence in Speech. Submitted to Neuropsychologia. | |
dc.relation.ispartofseries | Aalto University publication series DOCTORAL DISSERTATIONS | en |
dc.relation.ispartofseries | 88/2017 | |
dc.rev | House, David, Prof., KTH Royal Institute of Technology, Sweden | |
dc.rev | Richmond, Korin, Dr., University of Edinburgh, UK | |
dc.subject.helecon | tekniikka | |
dc.subject.helecon | oppiminen | |
dc.subject.helecon | viestintä | |
dc.subject.keyword | prosody | en |
dc.subject.keyword | prominence | en |
dc.subject.keyword | attention | en |
dc.subject.keyword | speech perception | en |
dc.subject.keyword | statistical learning | en |
dc.subject.keyword | stimulus predictability | en |
dc.subject.keyword | speech analysis | en |
dc.subject.keyword | cognitive modeling | en |
dc.subject.other | Acoustics | en |
dc.subject.other | Psychology | en |
dc.subject.other | Linguistics | en |
dc.title | Cognitive and probabilistic basis of prominence perception in speech | en |
dc.type | G5 Artikkeliväitöskirja | fi |
dc.type.dcmitype | text | en |
dc.type.ontasot | Doctoral dissertation (article-based) | en |
dc.type.ontasot | Väitöskirja (artikkeli) | fi |
local.aalto.archive | yes | |
local.aalto.formfolder | 2017_05_11_klo_14_43 | |
Files
Original bundle
- Name: isbn9789526074238.pdf
- Size: 1.74 MB
- Format: Adobe Portable Document Format