Citation:
Tamper , M , Tuominen , J & Hyvönen , E 2022 , ' Extending the Finnish Linked Data Infrastructure with Natural Language Processing Services in FIN-CLARIAH ' , CEUR Workshop Proceedings , vol. 3232 , pp. 443-446 . < https://ceur-ws.org/Vol-3232//paper45.pdf >
|
Abstract:
The DARIAH-EU infrastructure for Digital Humanities (DH) is often focusing on using structured data for quantitative studies, while the EU-CLARIN infrastructure deals primarily with unstructured natural language texts. However, in DH research both texts and structured data are often needed. It therefore makes sense to develop and use both infrastructures together, as suggested in the Dutch CLARIAH programme and the corresponding FIN-CLARIAH initiative in Finland, a new part of the Finnish research infrastructure road map of the Academy of Finland. This poster paper introduces work in FIN-CLARIAH relating to the idea of integrating natural language processing (NLP) tools with the Linked Open Data (LOD) Infrastructure for Digital Humanities in Finland (LODI4DH). We present a plan for NLP services to be opened as part of the Linked Data Finland (LDF.fi) platform. The new services are used for knowledge extraction from Finnish texts for weaving LOD, and on the other hand for language DH data analyses of the published datasets in applications in many domains, such as political culture. The extended LDF.fi platform will provide users with documented APIs for NLP services using unified output formats as well as software delivery as Docker containers, to lower the bar for deployment.
|