Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation

Access rights

openAccess
publishedVersion

A3 Part of a book or other compilation work

Date

2022-10

Language

en

Pages

30

Series

Digital Linguistics ; Volume 1

Abstract

The Donate Speech campaign aimed to collect 10 000 hours of ordinary, casual Finnish speech to be used for studying language as well as for developing technology and services that can be readily used in the languages spoken in Finland. In this project, particular attention has been paid to allowing both academic and commercial use of the material. Although the ambitious target has not yet been reached, the Donate Speech campaign has managed to collect an extensive resource of more than 3 500 hours of colloquial Finnish speech, comprising more than 200 000 recordings by roughly 50 000 speakers from all over Finland, in just a few months.

Keywords

speech resources, colloquial speech, large-scale data collection, academic and commercial use

Citation

Lindén, K, Jauhiainen, T, Lennes, M, Kurimo, M, Rossi, A, Kurki, T & Pitkänen, O 2022, Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation. In CLARIN: The Infrastructure for Language Resources. Digital Linguistics, vol. 1, De Gruyter. https://doi.org/10.1515/9783110767377-019