Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Virpioja, Sami
dc.contributor.author Smit, Peter
dc.contributor.author Grönroos, Stig-Arne
dc.contributor.author Kurimo, Mikko
dc.date.accessioned 2013-12-12T10:00:59Z
dc.date.available 2013-12-12T10:00:59Z
dc.date.issued 2013
dc.identifier.isbn 978-952-60-5501-5 (electronic)
dc.identifier.issn 1799-490X (electronic)
dc.identifier.issn 1799-4896 (printed)
dc.identifier.issn 1799-4896 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/11836
dc.description.abstract Morfessor is a family of probabilistic machine learning methods that find morphological segmentations for words of a natural language, based solely on raw text data. Since the release of the public implementations of the Morfessor Baseline and Categories-MAP methods in 2005, they have become popular as automatic tools for processing morphologically complex languages for applications such as speech recognition and machine translation. This report describes a new implementation of the Morfessor Baseline method. The new version not only fixes the main restrictions of the previous software, but also includes recent methodological extensions such as semi-supervised learning, which can make use of small amounts of manually segmented words. Experimental results for the various features of the implementation are reported for English and Finnish segmentation tasks. en
dc.format.extent 38
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series SCIENCE + TECHNOLOGY en
dc.relation.ispartofseries 25/2013
dc.subject.other Computer science en
dc.subject.other Linguistics
dc.title Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline en
dc.type D4 Published development or research report or study en
dc.contributor.school Sähkötekniikan korkeakoulu fi
dc.contributor.school School of Electrical Engineering en
dc.contributor.department Signaalinkäsittelyn ja akustiikan laitos fi
dc.contributor.department Department of Signal Processing and Acoustics en
dc.subject.keyword morpheme segmentation en
dc.subject.keyword morphology induction en
dc.subject.keyword unsupervised learning en
dc.subject.keyword semi-supervised learning en
dc.subject.keyword morfessor en
dc.subject.keyword machine learning en
dc.identifier.urn URN:ISBN:978-952-60-5501-5
dc.type.dcmitype text en
