Out-of-distribution generalisation in spoken language understanding

Access rights

openAccess

A4 Article in conference proceedings

Date

2024-09-05

Language

en

Pages

5

Series

Interspeech

Abstract

Test data is said to be out-of-distribution (OOD) when it unexpectedly differs from the training data, a common challenge in real-world use cases of machine learning. Although OOD generalisation has gained interest in recent years, few works have focused on OOD generalisation in spoken language understanding (SLU) tasks. To facilitate research on this topic, we introduce a modified version of the popular SLU dataset SLURP, featuring data splits for testing OOD generalisation in the SLU task. We call our modified dataset SLURP For OOD generalisation, or SLURPFOOD. Utilising our OOD data splits, we find end-to-end SLU models to have limited capacity for generalisation. Furthermore, by employing model interpretability techniques, we shed light on the factors contributing to the generalisation difficulties of the models. To improve the generalisation, we experiment with two techniques, which improve the results on some, but not all the splits, emphasising the need for new techniques.

Citation

Porjazovski, D, Moisio, A & Kurimo, M 2024, Out-of-distribution generalisation in spoken language understanding. in Interspeech 2024. Interspeech, International Speech Communication Association (ISCA), Interspeech, Kos Island, Greece, 01/09/2024. https://doi.org/10.21437/Interspeech.2024-940