Microphone array processing for parametric spatial audio techniques

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Pulkki, Ville, Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.contributor.author: Politis, Archontis
dc.contributor.department: Signaalinkäsittelyn ja akustiikan laitos [fi]
dc.contributor.department: Department of Signal Processing and Acoustics [en]
dc.contributor.lab: Communication Acoustics [en]
dc.contributor.school: Sähkötekniikan korkeakoulu [fi]
dc.contributor.school: School of Electrical Engineering [en]
dc.contributor.supervisor: Pulkki, Ville, Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.date.accessioned: 2016-09-28T09:01:26Z
dc.date.available: 2016-09-28T09:01:26Z
dc.date.defence: 2016-11-04
dc.date.issued: 2016
dc.description.abstract: Reproduction of the spatial properties of recorded sound scenes is increasingly recognised as a crucial element of emerging immersive applications, with domestic or cinema-oriented audiovisual reproduction for entertainment, telepresence and immersive teleconferencing, and augmented and virtual reality being key examples. Such applications benefit from a general spatial audio processing framework that can exploit spatial information from a variety of recording formats in order to reproduce the original sound scene in a perceptually transparent way. Directional Audio Coding (DirAC) is a recent parametric spatial sound reproduction method that fulfils many of the requirements of such a framework. It is based on a universal 3D audio format known as B-format and achieves flexible and effective perceptual reproduction for loudspeakers or headphones. Part of this work focuses on the model of DirAC and aims to extend it. Firstly, it is shown that by taking into account information about the four-channel recording array that generates the B-format signals, it is possible to improve both the analysis of the sound scene and its reproduction. Secondly, these findings are generalised to various recording configurations. A further generalisation of DirAC is attempted in a spatial transform domain, the spherical harmonic domain (SHD), with higher-order B-format signals. Formulating the DirAC model in the SHD combines the perceptual effectiveness of DirAC with the increased resolution of higher-order B-format and overcomes most limitations of traditional DirAC. Two novel applications of parametric processing of spatial sound are demonstrated for sound and music engineering.
The first shows the potential of modifying the spatial information in the recording for creative manipulation of sound scenes, while the second shows improved reproduction of music captured with established surround recording techniques. The effectiveness of parametric techniques in conveying distance and externalisation cues over headphones led to research on controlling the perceived distance using loudspeakers in a room. This is achieved by manipulating the direct-to-reverberant energy ratio using a compact loudspeaker array with a variable directivity pattern. Lastly, apart from the reproduction of recorded sound scenes, auralisation of the spatial properties of acoustical spaces is of interest. We demonstrate that this problem is well suited to parametric spatial analysis. The nature of room impulse responses captured with a large microphone array allows very high-resolution approaches, and such approaches for the detection and localisation of multiple reflections in a single short observation window are applied and compared. [en]
dc.format.extent: 126 + app. 89
dc.format.mimetype: application/pdf [en]
dc.identifier.isbn: 978-952-60-7037-7 (electronic)
dc.identifier.isbn: 978-952-60-7038-4 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/22499
dc.identifier.urn: URN:ISBN:978-952-60-7037-7
dc.language.iso: en [en]
dc.opn: Elko, Gary, Dr., mh acoustics, New Jersey, NY, USA
dc.publisher: Aalto University [en]
dc.publisher: Aalto-yliopisto [fi]
dc.relation.haspart: [Publication 1]: Archontis Politis, Ville Pulkki. Broadband Analysis and Synthesis for Directional Audio Coding using A-format Input Signals. In 131st Convention of the Audio Engineering Society, New York, NY, USA, October 2011.
dc.relation.haspart: [Publication 2]: Archontis Politis, Tapani Pihlajamäki, Ville Pulkki. Parametric Spatial Audio Effects. In 15th International Conference on Digital Audio Effects (DAFx-12), York, UK, September 2012.
dc.relation.haspart: [Publication 3]: Archontis Politis, Mikko-Ville Laitinen, Jukka Ahonen, Ville Pulkki. Parametric Spatial Audio Processing of Spaced Microphone Array Recordings for Multichannel Reproduction. Journal of the Audio Engineering Society, 63, 4, 216–227, April 2015. DOI: 10.17743/jaes.2015.0015
dc.relation.haspart: [Publication 4]: Archontis Politis, Symeon Delikaris-Manias, Ville Pulkki. Direction-of-Arrival and Diffuseness Estimation Above Spatial Aliasing for Symmetrical Directional Microphone Arrays. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015. DOI: 10.1109/ICASSP.2015.7177921
dc.relation.haspart: [Publication 5]: Archontis Politis, Juha Vilkamo, Ville Pulkki. Sector-Based Parametric Sound Field Reproduction in the Spherical Harmonic Domain. IEEE Journal of Selected Topics in Signal Processing, 9, 5, 852–866, August 2015. DOI: 10.1109/JSTSP.2015.2415762
dc.relation.haspart: [Publication 6]: Mikko-Ville Laitinen, Archontis Politis, Ilkka Huhtakallio, Ville Pulkki. Controlling the Perceived Distance of an Auditory Object by Manipulation of Loudspeaker Directivity. JASA Express Letters, 137, 6, EL462–EL468, June 2015. DOI: 10.1121/1.4921678
dc.relation.haspart: [Publication 7]: Sakari Tervo, Archontis Politis. Direction of Arrival Estimation of Reflections from Room Impulse Responses Using a Spherical Microphone Array. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23, 10, 1539–1551, October 2015. DOI: 10.1109/TASLP.2015.2439573
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS [en]
dc.relation.ispartofseries: 195/2016
dc.rev: Zotter, Franz, Dr., University of Music and Performing Arts, Graz, Austria
dc.rev: Peters, Nils, Dr., Qualcomm, San Diego, CA, USA
dc.subject.keyword: spatial audio [en]
dc.subject.keyword: microphone arrays [en]
dc.subject.keyword: sound field analysis [en]
dc.subject.keyword: sound reproduction [en]
dc.subject.other: Acoustics [en]
dc.title: Microphone array processing for parametric spatial audio techniques [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.dcmitype: text [en]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.archive: yes
local.aalto.formfolder: 2016_09_27_klo_12_31
Files
Original bundle
Name: isbn9789526070377.pdf; Size: 3 MB; Format: Adobe Portable Document Format
Name: article1.pdf; Size: 827.92 KB; Format: Adobe Portable Document Format; Description: publisher's version
Name: article2.pdf; Size: 382.96 KB; Format: Adobe Portable Document Format; Description: publisher's version
Name: article3.pdf; Size: 543.91 KB; Format: Adobe Portable Document Format; Description: publisher's version
Name: article4.pdf; Size: 505.29 KB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article5.pdf; Size: 1.64 MB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article6.pdf; Size: 537.24 KB; Format: Adobe Portable Document Format; Description: publisher's version
Name: article7.pdf; Size: 2.06 MB; Format: Adobe Portable Document Format; Description: post print / author accepted version