Computational methods for comparison and exploration of event sequences


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Mannila, Heikki, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.contributor.author Lijffijt, Jefrey
dc.date.accessioned 2013-12-03T10:01:22Z
dc.date.available 2013-12-03T10:01:22Z
dc.date.issued 2013
dc.identifier.isbn 978-952-60-5475-9 (electronic)
dc.identifier.isbn 978-952-60-5474-2 (printed)
dc.identifier.issn 1799-4942 (electronic)
dc.identifier.issn 1799-4934 (printed)
dc.identifier.issn 1799-4934 (ISSN-L)
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/11798
dc.description.abstract Many types of data, e.g., natural language texts, biological sequences, or time series of sensor data, contain sequential structure. Analysis of such sequential structure is interesting for various reasons, for example, to detect that data consists of several homogeneous parts, that data contains certain recurring patterns, or to find parts that are different or surprising compared to the rest of the data. The main question studied in this thesis is how to identify global and local patterns in event sequences. Within this broad topic, we study several subproblems. The first problem that we address is how to compare event frequencies across event sequences and databases of event sequences. Such comparisons are relevant, for example, to linguists who are interested in comparing word counts between two corpora to identify linguistic differences, e.g., between groups of speakers, or language change over time. The second problem that we address is how to find areas in an event sequence where an event has a surprisingly high or low frequency. More specifically, we study how to take into account the multiple testing problem when looking for local frequency deviations in event sequences. Many algorithms for finding local patterns in event sequences require that the person applying the algorithm chooses the level of granularity at which the algorithm operates, and it is often not clear how to choose that level. The third problem that we address is which granularities to use when looking for local patterns in an event sequence. The main contributions of this thesis are computational methods that can be used to compare and explore (databases of) event sequences with high computational efficiency, increased accuracy, and that offer new perspectives on the sequential structure of data. 
Furthermore, we illustrate how the proposed methods can be applied to solve practical data analysis tasks, and describe several experiments and case studies where the methods are applied on various types of data. The primary focus is on natural language texts, but we also study DNA sequences and sensor data. We find that the methods work well in practice and that they can efficiently uncover various types of interesting patterns in the data. en
dc.format.extent 116
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher Aalto University en
dc.publisher Aalto-yliopisto fi
dc.relation.ispartofseries Aalto University publication series DOCTORAL DISSERTATIONS en
dc.relation.ispartofseries 205/2013
dc.subject.other Computer science en
dc.title Computational methods for comparison and exploration of event sequences en
dc.type G4 Monografiaväitöskirja fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.contributor.school School of Science en
dc.contributor.department Tietojenkäsittelytieteen laitos fi
dc.contributor.department Department of Information and Computer Science en
dc.subject.keyword pattern mining en
dc.subject.keyword event sequence en
dc.subject.keyword statistical significance en
dc.subject.keyword multiple testing en
dc.subject.keyword sliding window en
dc.subject.keyword window length en
dc.identifier.urn URN:ISBN:978-952-60-5475-9
dc.type.dcmitype text en
dc.type.ontasot Doctoral dissertation (monograph) en
dc.type.ontasot Väitöskirja (monografia) fi
dc.contributor.supervisor Rousu, Juho, Prof., Aalto University, Department of Information and Computer Science, Finland
dc.opn Goethals, Bart, Prof., University of Antwerp, Dept. of Math and Computer Science, Belgium
dc.rev Geerts, Floris, Prof., Universiteit Antwerpen, Belgium; Boulicaut, Jean-François, Prof., Institut National des Sciences Appliquées de Lyon, France
dc.date.defence 2013-12-16

