Audiovisual matching of room acoustics in virtual reality
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.advisor | Meyer-Kahlen, Nils | |
| dc.contributor.author | Qiu, Lisha | |
| dc.contributor.school | Sähkötekniikan korkeakoulu | fi |
| dc.contributor.school | School of Electrical Engineering | en |
| dc.contributor.supervisor | Lokki, Tapio | |
| dc.date.accessioned | 2025-12-16T18:03:28Z | |
| dc.date.available | 2025-12-16T18:03:28Z | |
| dc.date.issued | 2025-11-24 | |
| dc.description.abstract | Ensuring audio-visual consistency is crucial for building credible virtual reality (VR) and augmented reality (AR) experiences, as perceptual inconsistency can weaken the sense of presence and realism. Although visual-to-audio matching has been studied in VR, where users adjust the auditory rendering to fit a visual scene, many questions remain unexplored. This thesis systematically studies room-acoustic matching performance under four experimental conditions: Visual-to-Audio (V-to-A), Audiovisual-to-Audio (AV-to-A), Audio-to-Audio (A-to-A), and Audio-to-Visual (A-to-V), to address these research gaps. A high-fidelity VR experiment was conducted using measured spatial room impulse responses for binaural audio rendering, accompanied by panoramic visual scenes. The results demonstrate that matching performance improves significantly when an acoustic reference is available: relying solely on visual information (V-to-A) leads to substantially poorer performance, whereas both the AV-to-A and A-to-A conditions provide clear benefits. Furthermore, an inconsistent visual rendering of the source does not significantly undermine perceptual consistency. A slight asymmetry was observed between the A-to-V and V-to-A tasks, but it was not statistically significant. The analysis determined that reverberation and clarity are the main auditory cues (loading on PC1) for perceiving the size of and distance within a space, while the secondary component (PC2) plays no significant role in perception. This work provides a foundational framework for designing coherent spatial audio rendering systems in immersive media, highlighting the relative importance of auditory and visual references in achieving perceptual consistency in AR/VR. | en |
| dc.format.extent | 45 | |
| dc.format.mimetype | application/pdf | en |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/141219 | |
| dc.identifier.urn | URN:NBN:fi:aalto-202512169328 | |
| dc.language.iso | en | en |
| dc.location | P1 | fi |
| dc.programme | Master's Programme in Computer, Communication and Information Sciences | en |
| dc.programme | Tieto-, tietoliikenne- ja informaatiotekniikan maisteriohjelma | fi |
| dc.programme | Magisterprogrammet i data-, informations- och kommunikationsteknik | sv |
| dc.programme.major | Acoustics and Audio Technology | en |
| dc.subject.keyword | audio-visual matching | en |
| dc.subject.keyword | VR | en |
| dc.subject.keyword | AR | en |
| dc.subject.keyword | psychoacoustics | en |
| dc.subject.keyword | room acoustics | en |
| dc.subject.keyword | virtual acoustics | en |
| dc.title | Audiovisual matching of room acoustics in virtual reality | en |
| dc.type | G2 Pro gradu, diplomityö | fi |
| dc.type.ontasot | Master's thesis | en |
| dc.type.ontasot | Diplomityö | fi |
| local.aalto.electroniconly | yes | |
| local.aalto.openaccess | no |