The I4U mega fusion and collaboration for NIST speaker recognition evaluation 2016
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Date
2017
Department
Agency for Science, Technology and Research
University of Eastern Finland
Université du Maine
University of Texas at Austin
Darmstadt University of Applied Sciences
University of Nottingham
Avignon Université
Nanyang Technological University
ValidSoft
University of New South Wales
Hong Kong Polytechnic University
Aalborg University
EURECOM
Dept Signal Process and Acoust
IBM
Alibaba Group Inc.
Language
en
Pages
1328-1332 (5 pages)
Series
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Volume 2017-August, Interspeech: Annual Conference of the International Speech Communication Association
Abstract
The 2016 speaker recognition evaluation (SRE'16) is the latest edition in the series of benchmarking events conducted by the National Institute of Standards and Technology (NIST). I4U is a joint entry to SRE'16 resulting from the collaboration and active exchange of information among researchers from sixteen institutes and universities across four continents. The joint submission and several of its 32 sub-systems were among the top-performing systems. Considerable effort was devoted to two major challenges, namely unlabeled training data and dataset shift from Switchboard-Mixer to the new Call My Net dataset. This paper summarizes the lessons learned and presents the shared view of the sixteen research groups on the recent advances, major paradigm shift, and common tool chain used in speaker recognition, as witnessed in SRE'16. More importantly, we look into the intriguing question of fusing a large ensemble of sub-systems and the potential benefit of large-scale collaboration.
Keywords
Benchmark, Call My Net, Fusion, Speaker recognition evaluation
Citation
Lee, K A, Hautamäki, V, Kinnunen, T, Larcher, A, Zhang, C, Nautsch, A, Stafylakis, T, Rouvier, M, Rao, W, Alegre, F, Ma, J, Mak, M W, Sarkar, A K, Delgado, H, Saeidi, R, Aronowitz, H, Sizov, A, Sun, H, Nguyen, T H, Wang, G, Ma, B, Vestman, V, Sahidullah, M, Halonen, M, Kanervisto, A, Le Lan, G, Bahmaninezhad, F, Isadskiy, S, Rathgeb, C, Busch, C, Tzimiropoulos, G, Qian, Q, Wang, Z, Zhao, Q, Wang, T, Li, H, Xue, J, Zhu, S, Jin, R, Zhao, T, Bousquet, P M, Ajili, M, Kheder, W B, Matrouf, D, Lim, Z H, Xu, C, Xu, H, Xiao, X, Chng, E S, Fauve, B, Sriskandaraja, K, Sethu, V, Thomsen, D A L, Tan, Z H, Todisco, M, Evans, N, Li, H, Hansen, J H L, Bonastre, J F, Ambikairajah, E, Liu, G & Lin, W 2017, The I4U mega fusion and collaboration for NIST speaker recognition evaluation 2016. in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. vol. 2017-August, Interspeech: Annual Conference of the International Speech Communication Association, International Speech Communication Association (ISCA), pp. 1328-1332, Interspeech, Stockholm, Sweden, 20/08/2017. https://doi.org/10.21437/Interspeech.2017-203