Semi-supervised learning using pseudo-labels: A case study in Northern Sámi ASR

School of Electrical Engineering | Master's thesis

Language

en

Pages

50

Abstract

While self-supervised learning has advanced automatic speech recognition (ASR) for low-resource languages, limited labeled data remains a bottleneck, often preventing reasonable downstream performance. To address this, semi-supervised learning approaches have gained prominence as data-efficient methods that can help even when supervised data is scarce. This work investigates the usefulness of semi-supervised learning, specifically pseudo-labeling with self-supervised speech models, for Northern Sámi, an under-resourced agglutinative language. Within a teacher–student framework, pseudo-labels are incorporated using two strategies: (a) filtering them through WER-based agreement between the models, and (b) using all pseudo-labels generated by the teacher model. Results show that pseudo-labels from a larger teacher foundation model can improve the downstream performance of a smaller one without additional labeled data, yielding a 3.84–33.30% relative character error rate reduction (CERR) across three out-of-distribution test sets. The smaller student model is evaluated in a supervised setting (20 hours of labeled data) and in semi-supervised settings with 20, 26, and 75 hours of pseudo-labeled data. Notably, the student model can benefit as much from pseudo-labels as from labeled data, or even more. Further gains are achieved when annotated data and pseudo-labels are combined in a two-stage fine-tuning process rather than mixed together in a single training phase.
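The WER-based agreement strategy described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the function names, the `max_wer` threshold, and the treatment of the teacher/student models as simple transcription callables are all assumptions made for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (single-row DP)."""
    n = len(hyp)
    dp = list(range(n + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            # deletion, insertion, or (mis)match against the previous row
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1,
                        prev + (ref[i - 1] != hyp[j - 1]))
            prev = cur
    return dp[n]

def wer(ref, hyp):
    """Word error rate of hypothesis `hyp` against reference `ref`."""
    ref_words, hyp_words = ref.split(), hyp.split()
    if not ref_words:
        return 0.0 if not hyp_words else 1.0
    return edit_distance(ref_words, hyp_words) / len(ref_words)

def filter_pseudo_labels(utterances, teacher, student, max_wer=0.3):
    """Keep a teacher transcript as a pseudo-label only when the
    student's transcript agrees with it (WER below the threshold)."""
    kept = []
    for utt in utterances:
        t_hyp = teacher(utt)  # teacher model's transcription
        s_hyp = student(utt)  # student model's transcription
        if wer(t_hyp, s_hyp) <= max_wer:
            kept.append((utt, t_hyp))
    return kept
```

Utterances where the two models disagree strongly are discarded, so the filtered set trades coverage for (presumed) transcript quality; the second strategy in the abstract simply skips this filter and keeps every teacher transcript.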

Supervisor

Kurimo, Mikko

Thesis advisor

Grósz, Tamás
Getman, Yaroslav
