Comparison between conventional and end-to-end spoken language understanding models
Files
Holmberg_Oskar_2024.pdf (521.73 KB)
Aalto login required (access for Aalto Staff only).
School of Science (Perustieteiden korkeakoulu)
Bachelor's thesis
Electronic archive copy is available locally at the Harald Herlin Learning Centre. The staff of Aalto University has access to the electronic bachelor's theses by logging into Aaltodoc with their personal Aalto user ID. Read more about the availability of the bachelor's theses.
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2024-09-04
Department
Major/Subject
Computer Science (Tietotekniikka)
Mcode
SCI3027
Degree programme
Bachelor's Programme in Science and Technology (Teknistieteellinen kandidaattiohjelma)
Language
en
Pages
13
Series
Abstract
Spoken language understanding (SLU) plays an important role in enabling natural interaction between humans and machines, particularly as personal assistants such as Google Home grow in popularity. This thesis explores and compares the two main approaches to developing SLU systems: conventional and end-to-end. The conventional approach uses separate modules for automatic speech recognition (ASR) and natural language understanding (NLU), while the end-to-end approach aims to learn directly from audio input. By examining recent research, this thesis evaluates the advantages and disadvantages of each approach, addressing questions such as accuracy and readiness for large-scale deployment.
Description
Supervisor
Savioja, Lauri
Thesis advisor
Porjazovski, Dejan
Keywords
spoken language understanding, end-to-end, intent classification