Comparison of dynamic cluster methods
No Thumbnail Available
URL
Journal Title
Journal ISSN
Volume Title
Perustieteiden korkeakoulu |
Master's thesis
Authors
Date
2020-08-18
Department
Major/Subject
Computer Science
Mcode
SCI3042
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
69+13
Series
Abstract
Traditional clustering algorithms are popular for data analysis. However, by nature, they are designed for static objects. There are many applications where objects change their states over time. If to apply traditional clustering methods on every timestamp, this may produce totally different clusters on each timestamp. Instead, it would be preferable to obtain clusters smoothly evolving over time. Starting with Chakrabarti et al. \cite{Chakrabarti} in 2006, the era of dynamic clusters came to the world. Since then, evolutionary versions of clusters for popular algorithms were designed: Kmeans, hierarchical, spectral, DBSCAN. They assume that the algorithm cost function consists of a traditional algorithm cost and a temporal smoothness. Then Xu et al. \cite{AdaptiveEvolutionary} proposed an adaptive framework where smoothing was applied to a proximity matrix followed by static clustering. In the thesis, we compare the performance of the traditional algorithms Kmeans, spectral and DBSCAN with their evolutionary and adaptive modifications. For the experiments, two datasets were chosen: a synthetic, generated from four Gaussian distributions and simulated four seasons, and a Foursquare, with six seasons for venues in San Francisco. The criteria for performance evaluation are clustering goodness and smoothness between consecutive timestamps. Also, the similarity between clustering results was estimated. For the synthetic data, although the adaptive and evolutionary algorithms show best values, the traditional Kmeans and spectral algorithms differ from them neglectfully. This means there is no need to apply sophisticated algorithms. The results on the Foursquare data show that the evolutionary spectral versions perform better than the other algorithms taking into account a goodness-smoothness trade-off.Description
Supervisor
Gionis, AristidesThesis advisor
Vasara, PetriKeywords
evolution clustering, dynamic clustering, temporal smoothness, foursquare, clustering