Cognate-aware morphological segmentation for multilingual neural translation

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Grönroos, Stig-Arne [en_US]
dc.contributor.author: Virpioja, Sami [en_US]
dc.contributor.author: Kurimo, Mikko [en_US]
dc.contributor.department: Dept Signal Process and Acoust [en]
dc.contributor.groupauthor: Centre of Excellence in Computational Inference, COIN [en]
dc.contributor.groupauthor: Speech Recognition [en]
dc.date.accessioned: 2019-01-14T09:21:53Z
dc.date.available: 2019-01-14T09:21:53Z
dc.date.issued: 2018-10-31 [en_US]
dc.description: | openaire: EC/H2020/780069/EU//MeMAD
dc.description.abstract: This article describes the Aalto University entry to the WMT18 News Translation Shared Task. We participate in the multilingual subtrack with a system trained under the constrained condition to translate from English to both Finnish and Estonian. The system is based on the Transformer model. We focus on improving the consistency of morphological segmentation for words that are similar orthographically, semantically, and distributionally; such words include etymological cognates, loan words, and proper names. For this, we introduce Cognate Morfessor, a multilingual variant of the Morfessor method. We show that our approach improves the translation quality particularly for Estonian, which has fewer resources for training the translation model. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 8
dc.format.extent: 390-397
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Grönroos, S-A, Virpioja, S & Kurimo, M 2018, Cognate-aware morphological segmentation for multilingual neural translation. In Third Conference on Machine Translation (WMT18); Brussels, Belgium. Association for Computational Linguistics, pp. 390-397, Conference on Machine Translation, Brussels, Belgium, 31/10/2018. https://doi.org/10.18653/v1/W18-64037 [en]
dc.identifier.doi: 10.18653/v1/W18-64037 [en_US]
dc.identifier.isbn: 978-1-948087-81-0
dc.identifier.other: PURE UUID: 66d66e8f-6c09-4763-84ad-532d0e9809e2 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/66d66e8f-6c09-4763-84ad-532d0e9809e2 [en_US]
dc.identifier.other: PURE LINK: http://aclweb.org/anthology/W18-6410 [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/30804144/WMT037_1_.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/35969
dc.identifier.urn: URN:NBN:fi:aalto-201901141152
dc.language.iso: en [en]
dc.relation: info:eu-repo/grantAgreement/EC/H2020/780069/EU//MeMAD [en_US]
dc.relation.ispartof: Conference on Machine Translation [en]
dc.relation.ispartofseries: Third Conference on Machine Translation (WMT18); Brussels, Belgium [en]
dc.rights: openAccess [en]
dc.subject.keyword: neural machine translation [en_US]
dc.subject.keyword: morphology [en_US]
dc.subject.keyword: cognate [en_US]
dc.subject.keyword: multilingual [en_US]
dc.title: Cognate-aware morphological segmentation for multilingual neural translation [en]
dc.type: A4 Article in conference proceedings (A4 Artikkeli konferenssijulkaisussa) [fi]
dc.type.version: publishedVersion