Extending the Finnish Linked Data Infrastructure with Natural Language Processing Services in FIN-CLARIAH

Conference article
Date
2022
Language
en
Pages
4
443-446
Series
CEUR Workshop Proceedings, Volume 3232
Abstract
The DARIAH-EU infrastructure for Digital Humanities (DH) often focuses on using structured data for quantitative studies, while the EU-CLARIN infrastructure deals primarily with unstructured natural language texts. However, DH research often needs both texts and structured data. It therefore makes sense to develop and use both infrastructures together, as suggested in the Dutch CLARIAH programme and the corresponding FIN-CLARIAH initiative in Finland, a new part of the Finnish research infrastructure road map of the Academy of Finland. This poster paper introduces work in FIN-CLARIAH on integrating natural language processing (NLP) tools with the Linked Open Data (LOD) Infrastructure for Digital Humanities in Finland (LODI4DH). We present a plan for NLP services to be opened as part of the Linked Data Finland (LDF.fi) platform. The new services are used, on the one hand, for knowledge extraction from Finnish texts for weaving LOD and, on the other hand, for language-related DH data analyses of the published datasets in applications in many domains, such as political culture. The extended LDF.fi platform will provide users with documented APIs for NLP services using unified output formats, as well as software delivery as Docker containers, to lower the bar for deployment.
Description
Funding Information: Our work is funded by the Academy of Finland as part of the FIN-CLARIAH program for national research infrastructures. CSC – IT Center for Science provides computational resources for our project. Publisher Copyright: © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Keywords
knowledge extraction, linked data, natural language processing
Citation
Tamper, M, Tuominen, J & Hyvönen, E 2022, 'Extending the Finnish Linked Data Infrastructure with Natural Language Processing Services in FIN-CLARIAH', CEUR Workshop Proceedings, vol. 3232, pp. 443-446. <https://ceur-ws.org/Vol-3232//paper45.pdf>