Spherediar: An Effective Speaker Diarization System for Meeting Data
Access rights
openAccess
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Authors
Kaseva, T; Rouhe, A; Kurimo, M
Date
2019-12-01
Language
en
Pages
8
373-380
Series
2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
Abstract
In this paper, we present SphereDiar, a speaker diarization system composed of three novel subsystems: the Sphere-Speaker (SS) neural network, designed for speaker embedding extraction; a segmentation method called Homogeneity Based Segmentation (HBS); and a clustering algorithm called Top Two Silhouettes (Top2S). The system is evaluated on a set of over 200 manually transcribed multiparty meetings. The evaluation reveals that the system can be further simplified by omitting the use of HBS. Furthermore, we illustrate that SphereDiar achieves state-of-the-art results with two different meeting data sets.
Description
openaire: EC/H2020/780069/EU//MeMAD
Keywords
segmentation, silhouette coefficients, speaker diarization, speaker embeddings, spherical K-means
Citation
Kaseva, T, Rouhe, A & Kurimo, M 2019, Spherediar: An Effective Speaker Diarization System for Meeting Data. in 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings, 9003967, IEEE, pp. 373-380, IEEE Automatic Speech Recognition and Understanding Workshop, Singapore, Singapore, 15/12/2019. https://doi.org/10.1109/ASRU46091.2019.9003967