An Anonymization Tool for Open Data Publication of Legal Documents

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Oksanen, Arttu [en_US]
dc.contributor.author: Hyvönen, Eero [en_US]
dc.contributor.author: Tamper, Minna [en_US]
dc.contributor.author: Tuominen, Jouni [en_US]
dc.contributor.author: Ylimaa, Henna [en_US]
dc.contributor.author: Löytynoja, Katja [en_US]
dc.contributor.author: Kokkonen, Matti [en_US]
dc.contributor.author: Hietanen, Aki [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Computer Science Professors [en]
dc.contributor.groupauthor: Computer Science - Artificial Intelligence and Machine Learning (AIML) [en]
dc.contributor.groupauthor: Professorship Hyvönen Eero [en]
dc.contributor.organization: Statistics Finland [en_US]
dc.contributor.organization: Ministry of Justice, Finland [en_US]
dc.date.accessioned: 2022-12-07T07:21:27Z
dc.date.available: 2022-12-07T07:21:27Z
dc.date.issued: 2022 [en_US]
dc.description: Publisher Copyright: © 2022 Copyright for this paper by its authors.
dc.description.abstract: The EU General Data Protection Regulation (GDPR) requires anonymization of documents containing personal data, such as court decisions, for public use. Doing this manually is costly and time-consuming but can be automated by applying Natural Language Processing (NLP) methods. This paper introduces the ANOPPI tool developed for (semi-)automatic anonymization of Finnish texts. The tool can be used both as a web application and programmatically through a REST API. Evaluation shows that ANOPPI performs well with different types of documents; however, further improving the performance of the named entity recognition and disambiguation methods would enhance the usefulness of the software. The tool is being published as open source for public use by the Ministry of Justice in Finland. A use case of ANOPPI is to publish court decisions on the Web in the LawSampo semantic portal for human close reading and as Linked Open Data for data analysis in legal informatics. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 10
dc.format.extent: 12-21
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Oksanen, A, Hyvönen, E, Tamper, M, Tuominen, J, Ylimaa, H, Löytynoja, K, Kokkonen, M & Hietanen, A 2022, 'An Anonymization Tool for Open Data Publication of Legal Documents', CEUR Workshop Proceedings, vol. 3257, pp. 12-21. [en]
dc.identifier.issn: 1613-0073
dc.identifier.other: PURE UUID: 2bc500da-1a94-449c-8bda-a14383318033 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/2bc500da-1a94-449c-8bda-a14383318033 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85142520225&partnerID=8YFLogxK [en_US]
dc.identifier.other: PURE LINK: https://ceur-ws.org/Vol-3257/ [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/93783096/An_Anonymization_Tool_for_Open_Data_Publication_of_Legal_Documents.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/118026
dc.identifier.urn: URN:NBN:fi:aalto-202212076771
dc.language.iso: en [en]
dc.publisher: RWTH Aachen University
dc.relation.ispartofseries: CEUR Workshop Proceedings [en]
dc.relation.ispartofseries: Volume 3257 [en]
dc.rights: openAccess [en]
dc.subject.keyword: anonymization [en_US]
dc.subject.keyword: case law [en_US]
dc.subject.keyword: named entity recognition [en_US]
dc.subject.keyword: pseudonymization [en_US]
dc.title: An Anonymization Tool for Open Data Publication of Legal Documents [en]
dc.type: A4 Article in conference proceedings
dc.type.version: publishedVersion
