Microphone array processing for parametric spatial audio techniques

School of Electrical Engineering | Doctoral thesis (article-based) | Defence date: 2016-11-04
126 + app. 89
Aalto University publication series DOCTORAL DISSERTATIONS, 195/2016
Reproduction of the spatial properties of recorded sound scenes is increasingly recognised as a crucial element of emerging immersive applications; key examples are domestic or cinema-oriented audiovisual reproduction for entertainment, telepresence and immersive teleconferencing, and augmented and virtual reality. Such applications benefit from a general spatial audio processing framework that can exploit spatial information from a variety of recording formats in order to reproduce the original sound scene in a perceptually transparent way.
Directional Audio Coding (DirAC) is a recent parametric spatial sound reproduction method that fulfils many of the requirements of such a framework. It is based on a universal 3D audio format known as B-format and achieves flexible and effective perceptual reproduction for loudspeakers or headphones. Part of this work focuses on the model of DirAC and aims to extend it. Firstly, it is shown that by taking into account information about the four-channel recording array that generates the B-format signals, it is possible to improve both the analysis of the sound scene and its reproduction. Secondly, these findings are generalised to various recording configurations. A further generalisation of DirAC is attempted in a spatial transform domain, the spherical harmonic domain (SHD), with higher-order B-format signals. Formulating the DirAC model in the SHD combines the perceptual effectiveness of DirAC with the increased resolution of higher-order B-format and overcomes most limitations of traditional DirAC.
Some novel applications of parametric processing of spatial sound are demonstrated for sound and music engineering. The first shows the potential of modifying the spatial information in the recording for creative manipulation of sound scenes, while the second improves the reproduction of music captured with established surround recording techniques. The effectiveness of parametric techniques in conveying distance and externalisation cues over headphones led to research into controlling the perceived distance using loudspeakers in a room. This is achieved by manipulating the direct-to-reverberant energy ratio using a compact loudspeaker array with a variable directivity pattern.
Lastly, apart from reproduction of recorded sound scenes, auralisation of the spatial properties of acoustical spaces is of interest. We demonstrate that this problem is well suited to parametric spatial analysis. The nature of room impulse responses captured with a large microphone array allows very high-resolution approaches, and such approaches for detection and localisation of multiple reflections in a single short observation window are applied and compared.
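As a rough illustration of the kind of parametric analysis that DirAC performs on B-format signals, the sketch below estimates a direction of arrival and a diffuseness value from one block of first-order signals. It is a minimal broadband, time-domain textbook form and not the A-format, spaced-array, or spherical harmonic domain formulations developed in the thesis. The function names (`encode_plane_wave`, `dirac_analysis`) are hypothetical, and the sketch assumes an encoding convention in which a plane wave with pressure signal s arriving from unit direction u yields W = s and [X, Y, Z] = s·u (traditional B-format scales W by 1/√2, which would require rescaling first).

```python
import math


def encode_plane_wave(signal, azimuth_deg, elevation_deg):
    """Encode a mono signal as an ideal first-order plane wave.

    Assumed convention (hypothetical, for this sketch only):
    W = s and [X, Y, Z] = s * u, with u the unit vector towards the source.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    ux = math.cos(el) * math.cos(az)
    uy = math.cos(el) * math.sin(az)
    uz = math.sin(el)
    w = list(signal)
    x = [s * ux for s in signal]
    y = [s * uy for s in signal]
    z = [s * uz for s in signal]
    return w, x, y, z


def dirac_analysis(w, x, y, z):
    """Return (azimuth_deg, elevation_deg, diffuseness) for one signal block."""
    n = len(w)
    # Time-averaged intensity-like vector; with the assumed encoding it
    # points towards the source.
    ix = sum(a * b for a, b in zip(w, x)) / n
    iy = sum(a * b for a, b in zip(w, y)) / n
    iz = sum(a * b for a, b in zip(w, z)) / n
    # Time-averaged energy estimate from the omni and dipole channels.
    energy = 0.5 * sum(
        a * a + b * b + c * c + d * d for a, b, c, d in zip(w, x, y, z)
    ) / n
    norm_i = math.sqrt(ix * ix + iy * iy + iz * iz)
    azimuth = math.degrees(math.atan2(iy, ix))
    elevation = math.degrees(math.atan2(iz, math.hypot(ix, iy)))
    # Diffuseness: 0 for a single plane wave, 1 when the net intensity vanishes.
    if energy > 0.0:
        diffuseness = min(1.0, max(0.0, 1.0 - norm_i / energy))
    else:
        diffuseness = 1.0  # silent block: no directional information
    return azimuth, elevation, diffuseness


if __name__ == "__main__":
    n = 480
    tone = [math.sin(2.0 * math.pi * 5.0 * k / n) for k in range(n)]
    print(dirac_analysis(*encode_plane_wave(tone, 60.0, 20.0)))
```

A practical implementation would apply the same estimates per time-frequency tile of a filterbank or short-time Fourier transform, since the DirAC parameters are frequency dependent; the single broadband block here only keeps the example short.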
Supervising professor
Pulkki, Ville, Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
Thesis advisor
Pulkki, Ville, Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
spatial audio, microphone arrays, sound field analysis, sound reproduction
Other note
  • [Publication 1]: Archontis Politis, Ville Pulkki. Broadband Analysis and Synthesis for Directional Audio Coding using A-format Input Signals. In 131st Convention of the Audio Engineering Society, New York, NY, USA, October 2011.
  • [Publication 2]: Archontis Politis, Tapani Pihlajamäki, Ville Pulkki. Parametric Spatial Audio Effects. In 15th International Conference on Digital Audio Effects (DAFx-12), York, UK, September 2012.
  • [Publication 3]: Archontis Politis, Mikko-Ville Laitinen, Jukka Ahonen, Ville Pulkki. Parametric Spatial Audio Processing of Spaced Microphone Array Recordings for Multichannel Reproduction. Journal of the Audio Engineering Society, 63, 4, 216–227, April 2015.
    DOI: 10.17743/jaes.2015.0015
  • [Publication 4]: Archontis Politis, Symeon Delikaris-Manias, Ville Pulkki. Direction-of-Arrival and Diffuseness Estimation Above Spatial Aliasing for Symmetrical Directional Microphone Arrays. In IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, Australia, April 2015.
    DOI: 10.1109/ICASSP.2015.7177921
  • [Publication 5]: Archontis Politis, Juha Vilkamo, Ville Pulkki. Sector-Based Parametric Sound Field Reproduction in the Spherical Harmonic Domain. IEEE Journal of Selected Topics in Signal Processing, 9, 5, 852–866, August 2015.
    DOI: 10.1109/JSTSP.2015.2415762
  • [Publication 6]: Mikko-Ville Laitinen, Archontis Politis, Ilkka Huhtakallio, Ville Pulkki. Controlling the Perceived Distance of an Auditory Object by Manipulation of Loudspeaker Directivity. JASA Express Letters, 137, 6, EL462–EL468, June 2015.
    DOI: 10.1121/1.4921678
  • [Publication 7]: Sakari Tervo, Archontis Politis. Direction of Arrival Estimation of Reflections from Room Impulse Responses Using a Spherical Microphone Array. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23, 10, 1539–1551, October 2015.
    DOI: 10.1109/TASLP.2015.2439573