Optimizing spherical loudspeaker array for voice directivity using the spherical cap model
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.advisor | Meyer-Kahlen, Nils | |
| dc.contributor.author | Palakkal Raghunath, Midhun | |
| dc.contributor.school | Sähkötekniikan korkeakoulu | fi |
| dc.contributor.school | School of Electrical Engineering | en |
| dc.contributor.supervisor | Arend, Johannes M. | |
| dc.date.accessioned | 2026-01-19T18:06:40Z | |
| dc.date.available | 2026-01-19T18:06:40Z | |
| dc.date.issued | 2025-12-31 | |
| dc.description.abstract | Conventional mouth simulators are limited by fixed radiation patterns, low power, and high self-noise, and they fail to capture the complex spatial characteristics of real speech, such as the downward tilt of the main radiation lobe. To overcome these limitations, a simulation-driven approach is developed using the analytical spherical cap model, which was implemented and validated against a known reference from the work of Aarts and Jansen [1]. This thesis investigates the optimization of a spherical loudspeaker array to accurately reproduce the dynamic and articulation-dependent directivity patterns of the human voice. A grid-search optimization method is applied to evaluate physical parameters, including array radius and driver radius, for different layouts (1, 4, 9, and 16 drivers), using a magnitude-weighted energy error metric. The single-driver configuration served as a baseline and proved the existence of optimal points, but it was found to be acoustically insufficient for replicating the asymmetric and time-varying nature of human speech. In contrast, by using a frequency-dependent regularized least-squares control strategy (Tikhonov regularization), the multi-driver arrays successfully reproduce the higher-order spatial modes required to match measured phoneme patterns. The results indicate that a 16-driver configuration provides a critical threshold of perceptual authenticity, effectively reproducing the downward-tilted radiation lobes of vowels and the unique spatial signatures of fricatives and nasals. This work establishes a robust framework for developing physical prototypes. | en |
| dc.format.extent | 44 | |
| dc.format.mimetype | application/pdf | en |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/142082 | |
| dc.identifier.urn | URN:NBN:fi:aalto-202601191458 | |
| dc.language.iso | en | en |
| dc.location | P1 | fi |
| dc.programme | Master's Programme in Computer, Communication and Information Sciences | en |
| dc.programme | Tieto-, tietoliikenne- ja informaatiotekniikan maisteriohjelma | fi |
| dc.programme | Magisterprogrammet i data-, informations- och kommunikationsteknik | sv |
| dc.programme.major | Acoustics and Audio Technology | en |
| dc.subject.keyword | spherical loudspeaker array | en |
| dc.subject.keyword | voice directivity | en |
| dc.subject.keyword | spherical cap model | en |
| dc.subject.keyword | regularization | en |
| dc.subject.keyword | phoneme-dependent radiation | en |
| dc.subject.keyword | ambisonics | en |
| dc.title | Optimizing spherical loudspeaker array for voice directivity using the spherical cap model | en |
| dc.type | G2 Pro gradu, diplomityö | fi |
| dc.type.ontasot | Master's thesis | en |
| dc.type.ontasot | Diplomityö | fi |
| local.aalto.electroniconly | yes | |
| local.aalto.openaccess | yes | |
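
The regularized least-squares control strategy named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the transfer matrix `G` (directions by drivers), the target pattern `d`, the regularization values, and the plain normalized squared error are all assumptions chosen for illustration. Random complex data stands in for spherical cap model responses at a single frequency. A frequency-dependent scheme would repeat the solve per frequency bin with its own regularization value.

```python
import numpy as np

def tikhonov_weights(G, d, lam):
    """Solve min_w ||G w - d||^2 + lam * ||w||^2 via the normal equations.

    G   : (n_dirs, n_drivers) complex transfer matrix (hypothetical data here)
    d   : (n_dirs,) complex target directivity at one frequency
    lam : non-negative Tikhonov regularization parameter
    """
    n_drivers = G.shape[1]
    A = G.conj().T @ G + lam * np.eye(n_drivers)
    return np.linalg.solve(A, G.conj().T @ d)

def normalized_error(G, w, d):
    """Squared reproduction error relative to the target energy.

    Assumption: a plain energy ratio, not the magnitude-weighted
    metric used in the thesis, whose definition is not given here.
    """
    return np.linalg.norm(G @ w - d) ** 2 / np.linalg.norm(d) ** 2

# Placeholder data: 64 evaluation directions, 16 drivers.
rng = np.random.default_rng(0)
n_dirs, n_drivers = 64, 16
G = rng.standard_normal((n_dirs, n_drivers)) + 1j * rng.standard_normal((n_dirs, n_drivers))
d = rng.standard_normal(n_dirs) + 1j * rng.standard_normal(n_dirs)

w_small = tikhonov_weights(G, d, 1e-6)  # weak regularization
w_large = tikhonov_weights(G, d, 1e2)   # strong regularization
```

Stronger regularization reduces the norm of the driver weights (less driver effort) at the cost of a larger reproduction error, which is the trade-off a per-frequency choice of `lam` would balance.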
Files
Original bundle
- Name: master_Palakkal_Raghunath_Midhun_2026.pdf
- Size: 5.85 MB
- Format: Adobe Portable Document Format