Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.author | Virpioja, Sami | |
| dc.contributor.author | Smit, Peter | |
| dc.contributor.author | Grönroos, Stig-Arne | |
| dc.contributor.author | Kurimo, Mikko | |
| dc.contributor.department | Signaalinkäsittelyn ja akustiikan laitos | fi |
| dc.contributor.department | Department of Signal Processing and Acoustics | en |
| dc.contributor.school | Sähkötekniikan korkeakoulu | fi |
| dc.contributor.school | School of Electrical Engineering | en |
| dc.date.accessioned | 2013-12-12T10:00:59Z | |
| dc.date.available | 2013-12-12T10:00:59Z | |
| dc.date.issued | 2013 | |
| dc.description.abstract | Morfessor is a family of probabilistic machine learning methods that find morphological segmentations for words of a natural language, based solely on raw text data. Since the public implementations of the Morfessor Baseline and Categories-MAP methods were released in 2005, they have become popular as automatic tools for processing morphologically complex languages in applications such as speech recognition and machine translation. This report describes a new implementation of the Morfessor Baseline method. The new version not only fixes the main restrictions of the previous software, but also includes recent methodological extensions such as semi-supervised learning, which can make use of small amounts of manually segmented words. Experimental results for the various features of the implementation are reported for English and Finnish segmentation tasks. | en |
| dc.format.extent | 38 | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.isbn | 978-952-60-5501-5 (electronic) | |
| dc.identifier.issn | 1799-490X (electronic) | |
| dc.identifier.issn | 1799-4896 (printed) | |
| dc.identifier.issn | 1799-4896 (ISSN-L) | |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/11836 | |
| dc.identifier.urn | URN:ISBN:978-952-60-5501-5 | |
| dc.language.iso | en | en |
| dc.publisher | Aalto University | en |
| dc.publisher | Aalto-yliopisto | fi |
| dc.relation.ispartofseries | Aalto University publication series SCIENCE + TECHNOLOGY | en |
| dc.relation.ispartofseries | 25/2013 | |
| dc.subject.keyword | morpheme segmentation | en |
| dc.subject.keyword | morphology induction | en |
| dc.subject.keyword | unsupervised learning | en |
| dc.subject.keyword | semi-supervised learning | en |
| dc.subject.keyword | morfessor | en |
| dc.subject.keyword | machine learning | en |
| dc.subject.other | Computer science | en |
| dc.subject.other | Linguistics | |
| dc.title | Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline | en |
| dc.type | D4 Published development or research report or study | en |
| dc.type.dcmitype | text | en |