Parametric spatial audio processing utilising compact microphone arrays

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Delikaris-Manias, Symeon
dc.contributor.department: Signaalinkäsittelyn ja akustiikan laitos [fi]
dc.contributor.department: Department of Signal Processing and Acoustics [en]
dc.contributor.lab: Communication Acoustics Group [en]
dc.contributor.school: Sähkötekniikan korkeakoulu [fi]
dc.contributor.school: School of Electrical Engineering [en]
dc.contributor.supervisor: Pulkki, Ville, Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
dc.date.accessioned: 2017-11-03T10:02:42Z
dc.date.available: 2017-11-03T10:02:42Z
dc.date.defence: 2017-11-10
dc.date.issued: 2017
dc.description.abstract: This dissertation focuses on the development of novel parametric spatial audio techniques using compact microphone arrays. Compact arrays are of special interest because they can be fitted into portable devices, which opens the possibility of using immersive spatial audio algorithms in daily life. The techniques developed in this thesis use signal processing algorithms adapted for human listeners, thus exploiting the capabilities and limitations of human spatial hearing. The findings of this research fall into three areas of spatial audio processing: directional filtering, spatial audio reproduction, and direction of arrival estimation.
In directional filtering, two novel algorithms have been developed based on the cross-pattern coherence (CroPaC). The method exploits the directional responses of two different types of beamformers by using their cross-spectrum to estimate a soft masker. The soft masker provides a probability-like parameter that indicates whether sound is present in specific directions. It is then used as a post-filter to further suppress directionally distributed noise at the output of a beamformer. The performance of these algorithms represents a significant improvement over previous state-of-the-art methods.
In parametric spatial audio reproduction, an algorithm is developed for multi-channel loudspeaker and headphone rendering. Current limitations in spatial audio reproduction are related to high inter-channel coherence, which is common in signal-independent systems, or to time-frequency artefacts in parametric systems. The developed algorithm addresses these limitations by utilising two sets of beamformers. The first set, the analysis beamformers, is used to estimate a set of perceptually relevant sound-field parameters, such as the separate channel energies, inter-channel time differences and inter-channel coherences of the target-output-setup signals. The directionality of the analysis beamformers is defined so that it follows that of typical loudspeaker panning functions and, for headphone reproduction, that of the head-related transfer functions (HRTFs). The directionality of the second set of beamformers, which provide high audio quality, is then enhanced with the parametric information derived from the analysis beamformers. Listening tests confirm the perceptual benefit of this type of processing.
In direction of arrival (DOA) estimation, histogram analysis of beamforming-based and active-intensity-based DOA estimators is proposed. Numerical simulations and experiments with prototype and commercial microphone arrays show that the accuracy of DOA estimation is improved. [en]
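The directional-filtering idea summarised in the abstract (a soft masker estimated from the cross-spectrum of two beamformers, then applied as a post-filter) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function names `soft_mask` and `post_filter`, the energy normalisation, and the rectification and clipping to the range 0 to 1 are all assumptions made for illustration, and the record itself does not specify them.

```python
def soft_mask(beam_a, beam_b, floor=0.0):
    """Per-bin soft mask from the cross-spectrum of two beamformer outputs.

    beam_a, beam_b: equal-length sequences of complex time-frequency bins.
    The normalisation and clipping used here are assumptions of this sketch.
    """
    mask = []
    for a, b in zip(beam_a, beam_b):
        # Real part of the cross-spectrum between the two beam signals.
        cross = (a.conjugate() * b).real
        # Combined energy of both beams, used to normalise the mask.
        energy = (a.real ** 2 + a.imag ** 2) + (b.real ** 2 + b.imag ** 2)
        g = 2.0 * cross / energy if energy > 0.0 else 0.0
        # Discard negative values and clip to [floor, 1].
        mask.append(min(1.0, max(floor, g)))
    return mask


def post_filter(beam_out, mask):
    """Attenuate a beamformer output bin by bin with the soft mask."""
    return [g * x for g, x in zip(mask, beam_out)]
```

In this sketch, bins where the two beam signals are in phase and of similar magnitude receive a mask value near 1, while bins where they are out of phase are pushed to `floor`; a non-zero `floor` would limit the maximum attenuation applied by the post-filter.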
dc.format.extent: 84 + app. 78
dc.format.mimetype: application/pdf [en]
dc.identifier.isbn: 978-952-60-7660-7 (electronic)
dc.identifier.isbn: 978-952-60-7661-4 (printed)
dc.identifier.issn: 1799-4942 (electronic)
dc.identifier.issn: 1799-4934 (printed)
dc.identifier.issn: 1799-4934 (ISSN-L)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/28649
dc.identifier.urn: URN:ISBN:978-952-60-7660-7
dc.language.iso: en [en]
dc.opn: Jin, Craig, Prof., University of Sydney, Australia
dc.publisher: Aalto University [en]
dc.publisher: Aalto-yliopisto [fi]
dc.relation.haspart: [Publication 1]: Symeon Delikaris-Manias and Ville Pulkki. Cross pattern coherence algorithm for spatial filtering applications utilizing microphone arrays. IEEE Transactions on Audio, Speech, and Language Processing, Volume 21, issue 11, pages 2356–2367, November 2013. DOI: 10.1109/TASL.2013.2277928
dc.relation.haspart: [Publication 2]: Symeon Delikaris-Manias, Juha Vilkamo, and Ville Pulkki. Signal-dependent spatial filtering based on weighted-orthogonal beamformers in the spherical harmonic domain. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume 24, issue 9, pages 1511–1523, April 2016. DOI: 10.1109/TASLP.2016.2560523
dc.relation.haspart: [Publication 3]: Juha Vilkamo and Symeon Delikaris-Manias. Perceptual reproduction of spatial sound using loudspeaker-signal-domain parametrization. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume 23, issue 10, pages 1660–1669, June 2015. DOI: 10.1109/TASLP.2015.2443977
dc.relation.haspart: [Publication 4]: Symeon Delikaris-Manias, Juha Vilkamo, and Ville Pulkki. Parametric binaural rendering utilising compact microphone arrays. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, pages 629–633, 19–24 April 2015. DOI: 10.1109/ICASSP.2015.7178045
dc.relation.haspart: [Publication 5]: Symeon Delikaris-Manias, Despoina Pavlidi, Ville Pulkki, and Athanasios Mouchtaris. 3D localization of multiple audio sources utilizing 2D DOA histograms. In 24th European Signal Processing Conference (EUSIPCO), Budapest, Hungary, pages 1473–1477, 29 August–2 September 2016. DOI: 10.1109/EUSIPCO.2016.7760493
dc.relation.haspart: [Publication 6]: Symeon Delikaris-Manias, Despoina Pavlidi, Athanasios Mouchtaris, and Ville Pulkki. DOA estimation with histogram analysis of spatially constrained intensity vectors. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, USA, pages 526–530, 5–9 March 2017. DOI: 10.1109/ICASSP.2017.7952211
dc.relation.haspart: [Publication 7]: Leo McCormack, Symeon Delikaris-Manias, and Ville Pulkki. Parametric acoustic camera for real-time sound capture, analysis and tracking. In International Conference on Digital Audio Effects (DAFx-17), Edinburgh, UK, 5–9 September 2017.
dc.relation.ispartofseries: Aalto University publication series DOCTORAL DISSERTATIONS [en]
dc.relation.ispartofseries: 197/2017
dc.rev: Tashev, Ivan, Prof., Microsoft Research, USA
dc.rev: Souden, Mehrez, Dr., Apple Inc., USA
dc.subject.keyword: spatial audio [en]
dc.subject.keyword: directional filtering [en]
dc.subject.keyword: perceptual sound reproduction [en]
dc.subject.keyword: microphone arrays [en]
dc.subject.other: Acoustics [en]
dc.title: Parametric spatial audio processing utilising compact microphone arrays [en]
dc.type: G5 Artikkeliväitöskirja [fi]
dc.type.dcmitype: text [en]
dc.type.ontasot: Doctoral dissertation (article-based) [en]
dc.type.ontasot: Väitöskirja (artikkeli) [fi]
local.aalto.archive: yes
local.aalto.formfolder: 2017_11_02_klo_14_31

Files

Original bundle

Name: isbn9789526076607.pdf; Size: 4.92 MB; Format: Adobe Portable Document Format
Name: article1.pdf; Size: 2.77 MB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article2.pdf; Size: 1.64 MB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article3.pdf; Size: 2.21 MB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article4.pdf; Size: 462.01 KB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article5.pdf; Size: 1.09 MB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article6.pdf; Size: 900.34 KB; Format: Adobe Portable Document Format; Description: post print / author accepted version
Name: article7.pdf; Size: 7.25 MB; Format: Adobe Portable Document Format; Description: Final published version