Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation

A3 Part of a book or other compilation
This publication is imported from Aalto University research portal.
Date
2022-10
Language
en
Pages
30
Series
CLARIN: the infrastructure for language resources
Abstract
The Donate Speech campaign aimed to collect 10 000 hours of ordinary, casual Finnish speech for studying language and for developing technology and services that can be readily used in the languages spoken in Finland. The project paid particular attention to allowing both academic and commercial use of the material. Although the ambitious target has not yet been reached, the campaign has collected an extensive resource in just a few months: more than 3500 hours of colloquial Finnish speech, in more than 200 000 recordings by roughly 50 000 speakers from all over Finland.
Keywords
speech resources, colloquial speech, large-scale data collection, academic and commercial use
Citation
Lindén, K, Jauhiainen, T, Lennes, M, Kurimo, M, Rossi, A, Kurki, T & Pitkänen, O 2022, Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation. in CLARIN: the infrastructure for language resources. Digital Linguistics, vol. 1, De Gruyter. https://doi.org/10.1515/9783110767377-019