Implementing and testing the performance of deep neural network speaker verification models

School of Science (Perustieteiden korkeakoulu) | Master's thesis

Date

2023-01-23

Major/Subject

Machine Learning, Data Science and Artificial Intelligence

Mcode

SCI3044

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

73

Abstract

Speaker verification is a subtask of speaker recognition that uses speech, the most natural form of communication, as a biometric. A speaker verification system extracts and models the characteristic features of a speaker's voice from the speech signal. Verification is an essential tool in many applications, ranging from law enforcement to the voice-controlled smart assistants (e.g., Siri) that are now widespread in daily life. However, speech varies considerably across many sources, and this variability can severely degrade the performance of these systems. Recent work has therefore focused on mitigating these issues, helped by the creation of large datasets tailored to speaker recognition and by advances in deep learning that have significantly boosted performance. In particular, deep speaker embeddings are a successful technique for representing a speaker as a fixed-dimensional feature vector. This thesis implements two speaker verification systems that extract deep speaker embeddings using deep neural networks and an advanced objective function. The models are then analyzed on various test sets, including "in the wild" recordings and unseen languages, specifically Finnish. The experiments demonstrated the models' excellent generalization ability, their robustness to adverse conditions, and their capacity to be language-agnostic.

Description

Supervisor

Kurimo, Mikko

Thesis advisor

Virkkunen, Anja

Keywords

speaker recognition, speaker verification, deep learning, x-vector, ECAPA-TDNN
