Studies on binaural and monaural signal analysis methods and applications

Doctoral thesis (article-based)

Date

2009

Language

en

Pages

Online book (751 KB, 62 pp.)

Series

TKK dissertations in media technology, 1

Abstract

Sound signals can contain a lot of information about the environment and the sound sources present in it. This thesis presents novel contributions to the analysis of binaural and monaural sound signals. Some new applications are introduced in this work, but the emphasis is on analysis methods. The three main topics of the thesis are computational estimation of sound source distance, analysis of binaural room impulse responses, and applications intended for augmented reality audio. A novel method for binaural sound source distance estimation is proposed. The method is based on learning the coherence between the sounds entering the left and right ears. Comparisons to an earlier approach are also made. It is shown that learning methods of this kind can correctly recognize the distance of a speech sound source in most cases. Methods for analyzing binaural room impulse responses are investigated. These methods are able to locate the early reflections in time and also to estimate their directions of arrival. This challenging problem could not be solved completely, but this part of the work is an important step towards accurate estimation of the individual early reflections from a binaural room impulse response. As the third part of the thesis, applications of sound signal analysis are studied. The most notable contributions are a novel eyes-free user interface controlled by finger snaps and an investigation of the importance of features in audio surveillance. The results of this thesis are steps towards building machines that can obtain information about the surrounding environment based on sound. In particular, the research into sound source distance estimation serves as important basic research in this area. The applications presented could be valuable in future telecommunications scenarios, such as augmented reality audio.

Keywords

audio signal analysis, audio signal processing, augmented reality audio, binaural signals, sound source distance, room impulse responses, reverberation time, eyes-free user interfaces, audio surveillance

Parts

  • [Publication 1]: Sampo Vesa. 2007. Sound source distance learning based on binaural signals. In: Proceedings of the 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2007). New Paltz, NY, USA. 21-24 October 2007, pages 271-274. © 2007 IEEE. By permission.
  • [Publication 2]: Sampo Vesa. 2009. Binaural sound source distance learning in rooms. IEEE Transactions on Audio, Speech, and Language Processing, volume 17, number 8, pages 1498-1507. © 2009 IEEE. By permission.
  • [Publication 3]: Sampo Vesa and Tapio Lokki. 2006. Detection of room reflections from a binaural room impulse response. In: Proceedings of the 9th International Conference on Digital Audio Effects (DAFx 2006). Montreal, Canada. 18-20 September 2006, pages 215-220. © 2006 by authors.
  • [Publication 4]: Sampo Vesa and Tapio Lokki. 2009. Segmentation and analysis of early reflections from a binaural room impulse response. Espoo, Finland: Helsinki University of Technology, Department of Media Technology. 10 pages. TKK Reports in Media Technology, Technical Report TKK-ME-R-1. © 2009 by authors.
  • [Publication 5]: Sampo Vesa and Aki Härmä. 2005. Automatic estimation of reverberation time from binaural signals. In: Proceedings of the 30th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005). Philadelphia, PA, USA. 18-23 March 2005, volume 3, pages 281-284. © 2005 IEEE. By permission.
  • [Publication 6]: Sampo Vesa and Tapio Lokki. 2005. An eyes-free user interface controlled by finger snaps. In: Proceedings of the 8th International Conference on Digital Audio Effects (DAFx 2005). Madrid, Spain. 20-22 September 2005, pages 262-265. © 2005 by authors.
  • [Publication 7]: Sampo Vesa. 2007. The effect of features on clustering in audio surveillance. In: Proceedings of the AES 30th International Conference on Intelligent Audio Environments. Saariselkä, Finland. 15-17 March 2007. 10 pages. © 2007 Audio Engineering Society (AES). By permission.
  • [Errata file]: Errata of publications 1, 3, 5, 6 and 7
