Low-rank doubly stochastic matrix decomposition for cluster analysis

dc.contributorAalto-yliopistofi
dc.contributorAalto Universityen
dc.contributor.authorYang, Zhirong
dc.contributor.authorCorander, Jukka
dc.contributor.authorOja, Erkki
dc.contributor.departmentUniversity of Helsinki
dc.contributor.departmentDepartment of Computer Science
dc.date.accessioned2018-08-01T13:31:43Z
dc.date.available2018-08-01T13:31:43Z
dc.date.issued2016-10-01
dc.description.abstractCluster analysis by nonnegative low-rank approximations has experienced a remarkable progress in the past decade. However, the majority of such approximation approaches are still restricted to nonnegative matrix factorization (NMF) and su er from the following two drawbacks: 1) they are unable to produce balanced partitions for large-scale manifold data which are common in real-world clustering tasks; 2) most existing NMF-type clustering methods cannot automatically determine the number of clusters. We propose a new low-rank learning method to address these two problems, which is beyond matrix factorization. Our method approximately decomposes a sparse input similarity in a normalized way and its objective can be used to learn both cluster assignments and the number of clusters. For efficient optimization, we use a relaxed formulation based on Data-Cluster-Data random walk, which is also shown to be equivalent to low-rank factorization of the doublystochastically normalized cluster incidence matrix. The probabilistic cluster assignments can thus be learned with a multiplicative majorization-minimization algorithm. Experimental results show that the new method is more accurate both in terms of clustering large-scale manifold data sets and of selecting the number of clusters.en
dc.description.versionPeer revieweden
dc.format.extent25
dc.format.mimetypeapplication/pdf
dc.identifier.citationYang , Z , Corander , J & Oja , E 2016 , ' Low-rank doubly stochastic matrix decomposition for cluster analysis ' , Journal of Machine Learning Research , vol. 17 . < http://www.jmlr.org/papers/volume17/15-549/15-549.pdf >en
dc.identifier.issn1532-4435
dc.identifier.issn1533-7928
dc.identifier.otherPURE UUID: 9f23c732-2eee-47d1-8653-2b1e6378f937
dc.identifier.otherPURE ITEMURL: https://research.aalto.fi/en/publications/9f23c732-2eee-47d1-8653-2b1e6378f937
dc.identifier.otherPURE LINK: http://www.scopus.com/inward/record.url?scp=84995477290&partnerID=8YFLogxK
dc.identifier.otherPURE LINK: http://www.jmlr.org/papers/volume17/15-549/15-549.pdf
dc.identifier.otherPURE FILEURL: https://research.aalto.fi/files/26703463/15_549_1.pdf
dc.identifier.urihttps://aaltodoc.aalto.fi/handle/123456789/32860
dc.identifier.urnURN:NBN:fi:aalto-201808014261
dc.language.isoenen
dc.relation.ispartofseriesJournal of Machine Learning Researchen
dc.relation.ispartofseriesVolume 17en
dc.rightsopenAccessen
dc.subject.keywordCluster analysis
dc.subject.keywordDoubly stochastic matrix
dc.subject.keywordManifold
dc.subject.keywordMultiplicative updates
dc.subject.keywordProbabilistic relaxation
dc.titleLow-rank doubly stochastic matrix decomposition for cluster analysisen
dc.typeA1 Alkuperäisartikkeli tieteellisessä aikakauslehdessäfi
dc.type.versionpublishedVersion
Files