Studies on auditory processing of spatial sound and speech by neuromagnetic measurements and computational modeling

Doctoral thesis (article-based)
74, [app]
Report / Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, 74
This thesis addresses the auditory processing of spatial sound and speech. It consists of two research branches: first, magnetoencephalographic (MEG) brain measurements on spatial localization and speech perception; and second, the construction of computational auditory scene analysis models that exploit spatial cues and other cues that are robust in reverberant environments.
In the MEG research branch, we addressed the processing of spatial stimuli in the auditory cortex through studies concentrating on the following issues: processing of sound source location with realistic spatial stimuli, spatial processing of speech vs. non-speech stimuli, and processing of a range of spatial location cues in the auditory cortex. Our main findings are as follows. Both auditory cortices respond more vigorously to contralaterally presented sound, and the responses exhibit systematic tuning to the sound source direction. Responses and response dynamics are generally larger in the right hemisphere, which indicates right-hemispheric specialization in spatial processing. These observations hold over the range of speech and non-speech stimuli. The responses to speech sounds decrease markedly if the natural periodic speech excitation is replaced by a random noise sequence. Moreover, the activation strength of the right auditory cortex seems to reflect the processing of spatial cues: the dynamical differences are larger and the angular organization is more orderly for realistic spatial stimuli than for impoverished spatial stimuli (e.g. isolated interaural time and level difference cues).
In the auditory modeling part, we constructed models for the recognition of speech in the presence of interference. First, we constructed a system that uses binaural cues to segregate target speech from spatially separated interference, and showed that it outperforms a conventional approach at low signal-to-noise ratios. Second, we constructed a single-channel system that is robust in room reverberation, using strong speech modulations as robust cues, and showed that it outperforms a baseline approach in the most reverberant test conditions. In this case, the baseline approach was specifically optimized for recognition of speech in reverberation.
In summary, this thesis addresses the auditory processing of spatial sound and speech through both brain measurements and auditory modeling. The studies aim to clarify the cortical processes of sound localization and to construct computational auditory models for sound segregation that exploit spatial cues, as well as strong speech modulations as robust cues in reverberation.
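The interaural time and level difference cues mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the binaural model developed in the thesis; it is a minimal illustration, assuming a broadband noise source, a plain full-band cross-correlation for the time difference, and an energy ratio for the level difference. The function name `estimate_itd_ild`, the sampling rate, and the delay and gain values are all hypothetical choices made for this example.

```python
import math
import random


def estimate_itd_ild(left, right, fs, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length channels.

    ITD: the lag of the right channel relative to the left that
    maximises their cross-correlation (positive = right ear lags).
    ILD: level of the left channel relative to the right, in dB.
    """
    n = len(left)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the valid overlap.
        start, stop = max(0, -lag), min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    ild_db = 10.0 * math.log10(energy_left / energy_right)
    return best_lag / fs, ild_db


# Toy stimulus (hypothetical values): noise that reaches the right ear
# 8 samples later and at half the amplitude of the left ear signal.
rng = random.Random(0)
fs = 16000
delay = 8
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
left = source
right = [0.0] * delay + [0.5 * x for x in source[:-delay]]

itd, ild = estimate_itd_ild(left, right, fs, max_lag=16)
```

With these assumed values the estimated time difference corresponds to the 8-sample delay (0.5 ms at 16 kHz), and the level difference is close to 6 dB, the value expected for a halved amplitude. Auditory models typically compute such cues per frequency channel after a cochlear filterbank rather than on the full-band signal as done here.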
spatial localization, auditory cortex, MEG, N1m, binaural models, CASA, missing data speech recognition
  • Palomäki K., Alku P., Mäkinen V., May P. and Tiitinen H., 2000. Sound localization in the human brain: neuromagnetic observations. NeuroReport 11 (7), pages 1535-1538.
  • Palomäki K. J., Tiitinen H., Mäkinen V., May P. and Alku P., 2002. Cortical processing of speech sounds and their analogues in a spatial auditory environment. Cognitive Brain Research 14 (2), pages 294-299. [article2.pdf] © 2002 Elsevier Science. By permission.
  • Alku P., Sivonen P., Palomäki K. J. and Tiitinen H., 2001. The periodic structure of vowel sounds is reflected in human electromagnetic brain responses. Neuroscience Letters 298 (1), pages 25-28. [article3.pdf] © 2001 Elsevier Science. By permission.
  • Palomäki K. J., Tiitinen H., Mäkinen V., May P. and Alku P., 2005. Spatial processing in human auditory cortex: the effects of 3D, ITD, and ILD stimulation techniques. Cognitive Brain Research, accepted for publication. [article4.pdf] © 2005 Elsevier Science. By permission.
  • Palomäki K. J., Brown G. J. and Wang D. L., 2004. A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation. Speech Communication 43 (4), pages 361-378. [article5.pdf] © 2004 Elsevier Science. By permission.
  • Palomäki K. J., Brown G. J. and Barker J., 2004. Techniques for handling convolutional distortion with 'missing data' automatic speech recognition. Speech Communication 43 (1-2), pages 123-142. [article6.pdf] © 2004 Elsevier Science. By permission.