Title: | Approaching human performance in noise robust automatic speech recognition / Kohti ihmiskykyjä melusietoisessa automaattisessa puheentunnistuksessa |
Author(s): | Keronen, Sami |
Date: | 2014 |
Language: | en |
Pages: | 65 + app. 59 |
Department: | Signaalinkäsittelyn ja akustiikan laitos / Department of Signal Processing and Acoustics |
Supervising professor(s): | Kurimo, Mikko |
Thesis advisor(s): | Palomäki, Kalle |
Subject: | Acoustics, Electrical engineering |
Keywords: | noise robust, speech recognition, mask estimation, linear prediction, melusietoinen, puheentunnistus, maskiestimointi, lineaariprediktio, GRBM |
OEVS yes | |
Abstract: Modern automatic speech recognition systems can recognize read speech in low-noise operating environments almost as accurately as humans, but in heavy background noise human recognition ability is considerably better than that of machines. The study presents methods for noise robust spectral analysis of the speech signal and for missing data mask estimation, with the aim of improving automatic speech recognition in operating environments with varying noise types and low signal-to-noise ratios. |
|
Parts:
[Publication 1]: Sami Keronen, Ulpu Remes, Kalle J. Palomäki, Tuomas Virtanen and Mikko Kurimo. Comparison of Noise Robust Methods in Large Vocabulary Speech Recognition. In EUSIPCO 2010 – The 18th European Signal Processing Conference, Aalborg, Denmark, pp. 1973–1977, August 2010. (Full text not included in the electronic version of the thesis).
[Publication 2]: Sami Keronen, Jouni Pohjalainen, Paavo Alku and Mikko Kurimo. Noise Robust Feature Extraction Based on Extended Weighted Linear Prediction in LVCSR. In INTERSPEECH 2011 – Proceedings of the 12th Annual Conference of the International Speech Communication Association, Florence, Italy, pp. 1265–1268, August 2011. (Full text not included in the electronic version of the thesis).
[Publication 3]: Heikki Kallasjoki, Sami Keronen, Guy J. Brown, Jort F. Gemmeke, Ulpu Remes and Kalle J. Palomäki. Mask Estimation and Sparse Imputation for Missing Data Speech Recognition in Multisource Reverberant Environments. In CHiME – International Workshop on Machine Listening in Multisource Environments, Florence, Italy, pp. 58–63, September 2011. (Full text not included in the electronic version of the thesis).
[Publication 4]: Sami Keronen, Heikki Kallasjoki, Ulpu Remes, Guy J. Brown, Jort F. Gemmeke and Kalle J. Palomäki. Mask Estimation and Imputation Methods for Missing Data Speech Recognition in a Multisource Reverberant Environment. Computer Speech and Language, vol. 27 no. 3, pp. 798–819, February 2013. (Full text not included in the electronic version of the thesis).
[Publication 5]: Sami Keronen, KyungHyun Cho, Tapani Raiko, Alexander Ilin and Kalle J. Palomäki. Gaussian-Bernoulli Restricted Boltzmann Machines and Automatic Feature Extraction for Noise Robust Missing Data Mask Estimation. In ICASSP 2013 – The 38th International Conference on Acoustics, Speech, and Signal Processing, Vancouver, Canada, pp. 6729–6733, May 2013. (Full text not included in the electronic version of the thesis).
[Publication 6]: Sami Keronen, Ulpu Remes, Heikki Kallasjoki and Kalle J. Palomäki. Noise Robust Missing Data Mask Estimation Based on Automatically Learned Features. In CHiME 2013 – The 2nd International Workshop on Machine Listening in Multisource Environments, Vancouver, Canada, pp. 77–78, June 2013. (Full text not included in the electronic version of the thesis). |
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.