Browsing by Author "Hold, Christoph"
Item Auralization of Measured Room Transitions in Virtual Reality (Audio Engineering Society, 2023-06-03) McKenzie, Thomas; Meyer-Kahlen, Nils; Hold, Christoph; Schlecht, Sebastian J.; Pulkki, Ville; Department of Art and Media; Department of Information and Communications Engineering; Virtual Acoustics; Communication Acoustics: Spatial Sound and Psychoacoustics
To auralize a room’s acoustics in six degrees-of-freedom virtual reality (VR), a dense set of spatial room impulse response (SRIR) measurements is required, so interpolating between a sparse set is desirable. This paper studies the auralization of room transitions by proposing a baseline interpolation method for higher-order Ambisonic SRIRs and evaluating it in VR. The presented method is simple yet applicable to coupled rooms and room transitions. It is based on linear interpolation with RMS compensation, although direct sound, early reflections, and late reverberation are processed separately, whereby the input direct sounds are first steered to the relative direction-of-arrival before summation and interpolated early reflections are directionally equalized. The proposed method is first evaluated numerically, which demonstrates its improvements over a basic linear interpolation. A listening test is then conducted in six degrees-of-freedom VR, to assess the density of SRIR measurements needed in order to plausibly auralize a room transition using the presented interpolation method.
The results suggest that, given the tested scenario, a 50-cm to 1-m inter-measurement distance can be perceptually sufficient.

Item Compression of Higher-Order Ambisonic Signals using Directional Audio Coding (IEEE, 2024) Hold, Christoph; Pulkki, Ville; Politis, Archontis; McCormack, Leo; Department of Information and Communications Engineering; Communication Acoustics: Spatial Sound and Psychoacoustics; Virtual Acoustics
Delivering high-quality spatial audio in the Ambisonics format requires extensive data bandwidth, which may render it inaccessible for many low-bandwidth applications. Existing widely-available multi-channel audio compression codecs are not designed to consider the characteristic inter-channel relations inherent to the Ambisonics format, and thus may not leverage this knowledge to optimise the compression. Therefore, this article proposes a spatial audio compression algorithm, based on a novel reformulation of the Higher-Order Directional Audio Coding (HO-DirAC) method, which is specifically intended for compressing higher-order Ambisonic audio streams. The methodology builds upon the concept of a spherical filter bank acting in the spherical harmonic domain. This results in directionally constrained sound-field estimates and parameterization, which may be utilized to reconstruct the input Ambisonic signals with minimal perceived loss of quality. The results of a listening experiment indicate high perceptual quality when using six or more audio transport channels to deliver fifth-order (36 channels) Ambisonic sound scenes.
The proposed formulation is also designed with low computational complexity in mind and may therefore be well suited for compressing Ambisonic sound scenes for a wide range of applications.

Item Magnitude-Least-Squares Binaural Ambisonic Rendering with Phase Continuation (2023-03-31) Hold, Christoph; Meyer-Kahlen, Nils; Pulkki, Ville; Department of Information and Communications Engineering; Communication Acoustics: Spatial Sound and Psychoacoustics; Virtual Acoustics
Binaural rendering of Ambisonic signals is one of the most accessible ways of experiencing spatial audio. However, due to technical constraints, the rendering algorithm needs special care and advanced signal processing, especially for low Ambisonic orders. Next to more intricate parametric model-based approaches, other computationally efficient algorithms have emerged that provide powerful options. One particularly effective technique is to discard the phase of HRTFs above a certain frequency limit, where the auditory system is less sensitive to phase information, and instead utilize the available low-order resolution to achieve an optimized magnitude response. This technique is known as the magnitude-least-squares (magLS) binaural rendering algorithm and is often implemented as a recursive solution over frequency bins. However, altering the phase can lead to group delay errors, and therefore to frequency-dependent misalignment, i.e. dispersion, of the HRIR. This issue is particularly prevalent with measurements that show a significant pre-delay, such as linear-phase HRTFs. Besides analyzing the phase behavior of magLS, we present an effective way to preserve the group delay by continuing the phase over frequency as observed in the lower frequency region unaffected by the phase modification.
This simple modification leads to further improvements for the magLS binaural Ambisonics rendering.

Item Object-Based Six-Degrees-of-Freedom Rendering of Sound Scenes Captured with Multiple Ambisonic Receivers (Audio Engineering Society, 2022-05) McCormack, Leo; Politis, Archontis; McKenzie, Thomas; Hold, Christoph; Pulkki, Ville; Dept Signal Process and Acoust; Communication Acoustics: Spatial Sound and Psychoacoustics
This article proposes a system for object-based six-degrees-of-freedom (6DoF) rendering of spatial sound scenes that are captured using a distributed arrangement of multiple Ambisonic receivers. The approach is based on first identifying and tracking the positions of sound sources within the scene, followed by the isolation of their signals through the use of beamformers. These sound objects are subsequently spatialized over the target playback setup, with respect to both the head orientation and position of the listener. The diffuse ambience of the scene is rendered separately by first spatially subtracting the source signals from the receivers located nearest to the listener position. The resultant residual Ambisonic signals are then spatialized, decorrelated, and summed together with suitable interpolation weights. The proposed system is evaluated through an in situ listening test conducted in 6DoF virtual reality, whereby real-world sound sources are compared with the auralization achieved through the proposed rendering method.
The results of 15 participants suggest that in comparison to a linear interpolation-based alternative, the proposed object-based approach is perceived as being more realistic.

Item Optimizing Higher-Order Directional Audio Coding with Adaptive Mixing and Energy Matching for Ambisonic Compression and Upmixing (2023) Hold, Christoph; McCormack, Leo; Politis, Archontis; Pulkki, Ville; Department of Information and Communications Engineering; Communication Acoustics: Spatial Sound and Psychoacoustics; Virtual Acoustics
In order to transmit sound-scenes encoded into the higher-order Ambisonics (HOA) format to low-bandwidth devices, transmission codecs are needed to reduce data requirements. Recently, the model-based higher-order directional audio coding (HO-DirAC) method was formulated for HOA input to HOA output. Compression is achieved by reducing the number of audio transport channels through spatial discretization. These transport channels are then used to reconstruct the scene on the receiving end based on accompanying spatial metadata. This reconstructed scene may also be optionally upmixed to a higher order, leading to an enhancement in spatial resolution. In this paper, the authors analyze certain sound-scenes that were especially challenging for the previously proposed HO-DirAC framework, which the authors postulate could be attributed to the lower-order reconstruction of diffuse sound-field components. Three optimizations for HO-DirAC are proposed, which all employ optimal adaptive mixing and/or energy matching of Ambisonic components based on spatial covariance matrices. The methods are formulated such that they are applied directly in the reconstruction of HOA from the spatially discrete transport audio signals. Notably, a dedicated low-complexity solution without additional side-information is derived. Instrumental evaluations confirm a reduced reconstruction error when using either of the proposed optimizations.
These improvements were also demonstrated via a perceptual evaluation, whereby four, six, and twelve transport channels were used to reconstruct (and upmix to) fifth-order reference sound-scenes. The evaluation highlighted the high perceptual performance of the proposed optimizations, including the low-complexity version, thereby improving parametric spatial audio coding and reproduction.

Item Parametric Ambisonic Encoding using a Microphone Array with a One-plus-Three Configuration (2022) McCormack, Leo; Gonzalez, Raimundo; Fernandez, Janani; Hold, Christoph; Politis, Archontis; Dept Signal Process and Acoust; Department of Neuroscience and Biomedical Engineering; Communication Acoustics: Spatial Sound and Psychoacoustics; Virtual Acoustics; Lokki Tapio group
A parametric signal-dependent method is proposed for the task of encoding a studio omnidirectional microphone signal into the Ambisonics format. This is realised by affixing three additional sensors to the surface of the cylindrical microphone casing, representing a practical solution for imparting spatial audio recording capabilities onto an otherwise non-spatial audio compliant microphone. The one-plus-three configuration and parametric encoding method were evaluated through formal listening tests using simulated sound scenes and array recordings, given a binaural decoding workflow.
The results indicate that, when compared to employing first-order signals obtained linearly using an open tetrahedral array, or third-order signals derived from a 19-sensor spherical array, the proposed system is able to produce perceptually closer renderings to those obtained using ideal third-order signals.

Item Recording and reproduction of 6DOF audio (2023-12-20) Puttonen, Toni; Hold, Christoph; Sähkötekniikan korkeakoulu; Turunen, Markus

Item Resynthesis of Spatial Room Impulse Response Tails With Anisotropic Multi-Slope Decays (Audio Engineering Society, 2022-06) Hold, Christoph; McKenzie, Thomas; Götz, Georg; Schlecht, Sebastian; Pulkki, Ville; Dept Signal Process and Acoust; Department of Art and Media; Communication Acoustics: Spatial Sound and Psychoacoustics
Spatial room impulse responses (SRIRs) capture room acoustics with directional information. SRIRs measured in coupled rooms and spaces with non-uniform absorption distribution may exhibit anisotropic reverberation decays and multiple decay slopes. However, noisy measurements with low signal-to-noise ratios pose issues in analysis and reproduction in practice. This paper presents a method for resynthesis of the late decay of anisotropic SRIRs, effectively removing noise from SRIR measurements. The method accounts for both multi-slope decays and directional reverberation. A spherical filter bank extracts directionally constrained signals from Ambisonic input, which are then analyzed and parameterized in terms of multiple exponential decays and a noise floor. The noisy late reverberation is then resynthesized from the estimated parameters using modal synthesis, and the restored SRIR is reconstructed as Ambisonic signals. The method is evaluated both numerically and perceptually, which shows that SRIRs can be denoised with minimal error as long as parts of the decay slope are above the noise level, with signal-to-noise ratios as low as 40 dB in the presented experiment.
The method can be used to increase the perceived spatial audio quality of noise-impaired SRIRs.

Item Sector-Based Encoding and Data Compression of Virtual Acoustic Scattering (2022) Gonzalez, Raimundo; Hold, Christoph; Lokki, Tapio; Politis, Archontis; Dept Signal Process and Acoust; Virtual Acoustics; Lokki Tapio group; Communication Acoustics: Spatial Sound and Psychoacoustics
In order to produce high fidelity representations of sound-fields in spatial audio applications, the acoustical nuances occurring within physical spaces must be included. This includes the effects of scattering from the boundary surfaces of enclosed spaces, as well as the scattering from finite bodies within spaces. Recently, a method has been proposed to encode the properties of arbitrary scattering geometries into the spherical harmonic domain in the form of a scattering expansion matrix. The following study extends this previous method by providing a sector-based approach. This approach allows for the encoding of selective regions of the scattering geometry, such as the upper hemisphere. Furthermore, a second optimization is also proposed to compress and reduce the memory storage of the scattering expansion matrix. Two methods for compression are proposed, with one of them providing minimal loss of fidelity by means of the Singular Value Decomposition.

Item Spatial Filter Bank Design in the Spherical Harmonic Domain (2021-12) Hold, Christoph; Politis, Archontis; McCormack, Leo; Pulkki, Ville; Dept Signal Process and Acoust; Communication Acoustics: Spatial Sound and Psychoacoustics
A fairly recent development in spatial audio is the concept of dividing a spherical sound field into several directionally-constrained regions, or sectors. Therefore, the sphere is spatially partitioned into components that should ideally reconstruct the unit sphere. When distributing such sectors uniformly on the sphere, their set makes up a bank of spatial filters, i.e. a spatial filter bank.
These sectors can be conveniently designed in the spherical harmonic domain such that each sector preserves the local properties of the acoustic energy-density. These traits have enabled recent improvements in the parameterization of higher-order Ambisonics, e.g. for spatial audio reproduction, multi-source analysis, and sound field visualization. However, when using a set of these sectors as a spatial filter bank, their spatial interaction incurs a scaling error if the reconstructed sound field is not properly compensated. This paper presents the methodology for designing a set of spatial filters in the spherical harmonic domain, which uniformly partition the sphere. Furthermore, a new corresponding compensation factor is derived enabling amplitude or energy preservation of the input sound field. This allows the implementation of a novel spatial filter bank in the spherical harmonic domain.

Item Spatial Filter Bank in the Spherical Harmonic Domain: Reconstruction and Application (2021-12) Hold, Christoph; Schlecht, Sebastian; Politis, Archontis; Pulkki, Ville; Dept Signal Process and Acoust; Department of Media; Communication Acoustics: Spatial Sound and Psychoacoustics
Filter banks are an integral part of modern signal processing. They may also be applied to spatial filtering, and the employed spatial filters can be designed with a specific shape for the analysis, e.g. suppressing side-lobes. After extracting spatially constrained signals from spherical harmonic (SH) input, i.e. filter bank analysis, many applications call for a re-synthesis of the associated sector signals to the SH domain. This paper hence derives the complementary spatial filter bank reconstruction. The criteria for perfect reconstruction and for energy-preserving reconstruction are given and implemented into the design. The filter bank is formulated such that for axisymmetric patterns both criteria can be met by only minor modification to the reconstruction stage.
Its application is then demonstrated for both scenarios, perfect reconstruction and energy preservation of SH input signals.

Item Visualization of Impulse Responses using Parametric Spatial Audio Techniques (2024-08-19) Talwadker, Vaibhav; Hold, Christoph; Sähkötekniikan korkeakoulu; Pulkki, Ville
The aim of the thesis was to analyze spatial room impulse responses (SRIRs) with the spatial decomposition method (SDM) using higher-order impulse responses by adopting the idea of sector processing from higher-order DirAC. The idea was to study scenarios in which sector processing would prove to be beneficial in localizing image sources accurately. Tests were conducted on simulated impulse responses using the image source method as well as on an actual higher-order SRIR dataset recorded using the Eigenmike spherical microphone array. Since first-order Ambisonics suffers from poor spatial resolution, higher-order Ambisonic signals are preferred, as the recorded sound field can be spatially segregated into sectors. While sectors have been used in earlier methods relying on DoA estimation using narrowband intensity vectors, their efficacy in broadband intensity-based DoA estimation has not been examined. Thus, this thesis studies the image source localization results obtained by combining sector processing with SDM.
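
To make the broadband intensity-based DoA estimation mentioned in the last abstract more concrete, the following is a minimal sketch and not the thesis's implementation. It assumes first-order Ambisonic signals in which a plane wave s arriving from unit direction u is encoded as w = s and (x, y, z) = s·u, with normalization gains ignored. The function names `direction_vector` and `pseudo_intensity_doa` are hypothetical. Under that assumed convention, the time-summed product of the omnidirectional and dipole signals (often called a pseudo-intensity vector) points toward the source.

```python
import numpy as np


def direction_vector(azimuth, elevation):
    """Unit vector for an azimuth/elevation pair given in radians."""
    return np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])


def pseudo_intensity_doa(w, x, y, z):
    """Broadband DoA estimate for one time window of first-order signals.

    Assumes w is the omnidirectional signal and x, y, z are dipole signals
    encoded as s * u (hypothetical convention, gains ignored).
    """
    # Sum the instantaneous products over the window (broadband estimate).
    i_vec = np.array([np.dot(w, x), np.dot(w, y), np.dot(w, z)])
    norm = np.linalg.norm(i_vec)
    if norm == 0.0:
        # No energy in the window: no direction can be estimated.
        return np.zeros(3)
    return i_vec / norm


# Synthetic check: a single noise-free plane wave from 60 deg azimuth,
# 20 deg elevation.
rng = np.random.default_rng(0)
s = rng.standard_normal(256)
true_doa = direction_vector(np.deg2rad(60.0), np.deg2rad(20.0))
w, x, y, z = s, s * true_doa[0], s * true_doa[1], s * true_doa[2]
estimate = pseudo_intensity_doa(w, x, y, z)
```

In a sector-based variant, the same estimate would be computed per sector from directionally constrained signals rather than from the full first-order set. That step is omitted here because it depends on the sector design.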