Low-rank doubly stochastic matrix decomposition for cluster analysis

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Yang, Zhirong
dc.contributor.author Corander, Jukka
dc.contributor.author Oja, Erkki
dc.date.accessioned 2018-08-01T13:31:43Z
dc.date.available 2018-08-01T13:31:43Z
dc.date.issued 2016-10-01
dc.identifier.citation Yang, Z, Corander, J & Oja, E 2016, 'Low-rank doubly stochastic matrix decomposition for cluster analysis', Journal of Machine Learning Research, vol. 17. en
dc.identifier.issn 1532-4435
dc.identifier.issn 1533-7928
dc.identifier.other PURE UUID: 9f23c732-2eee-47d1-8653-2b1e6378f937
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/lowrank-doubly-stochastic-matrix-decomposition-for-cluster-analysis(9f23c732-2eee-47d1-8653-2b1e6378f937).html
dc.identifier.other PURE LINK: http://www.scopus.com/inward/record.url?scp=84995477290&partnerID=8YFLogxK
dc.identifier.other PURE LINK: http://www.jmlr.org/papers/volume17/15-549/15-549.pdf
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/26703463/15_549_1.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/32860
dc.description.abstract Cluster analysis by nonnegative low-rank approximations has experienced remarkable progress in the past decade. However, the majority of such approximation approaches are still restricted to nonnegative matrix factorization (NMF) and suffer from the following two drawbacks: 1) they are unable to produce balanced partitions for large-scale manifold data, which are common in real-world clustering tasks; 2) most existing NMF-type clustering methods cannot automatically determine the number of clusters. We propose a new low-rank learning method, which goes beyond matrix factorization, to address these two problems. Our method approximately decomposes a sparse input similarity in a normalized way, and its objective can be used to learn both cluster assignments and the number of clusters. For efficient optimization, we use a relaxed formulation based on Data-Cluster-Data random walk, which is also shown to be equivalent to low-rank factorization of the doubly stochastically normalized cluster incidence matrix. The probabilistic cluster assignments can thus be learned with a multiplicative majorization-minimization algorithm. Experimental results show that the new method is more accurate both in clustering large-scale manifold data sets and in selecting the number of clusters. en
dc.format.extent 25
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation.ispartofseries Journal of Machine Learning Research en
dc.relation.ispartofseries Volume 17 en
dc.rights openAccess en
dc.subject.other Control and Systems Engineering en
dc.subject.other Software en
dc.subject.other Statistics and Probability en
dc.subject.other Artificial Intelligence en
dc.subject.other 113 Computer and information sciences en
dc.title Low-rank doubly stochastic matrix decomposition for cluster analysis en
dc.type A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä fi
dc.description.version Peer reviewed en
dc.contributor.department Tietojenkäsittelytieteen laito
dc.contributor.department Department of Computer Science
dc.subject.keyword Cluster analysis
dc.subject.keyword Doubly stochastic matrix
dc.subject.keyword Manifold
dc.subject.keyword Multiplicative updates
dc.subject.keyword Probabilistic relaxation
dc.subject.keyword Control and Systems Engineering
dc.subject.keyword Software
dc.subject.keyword Statistics and Probability
dc.subject.keyword Artificial Intelligence
dc.subject.keyword 113 Computer and information sciences
dc.identifier.urn URN:NBN:fi:aalto-201808014261
dc.type.version publishedVersion

Files in this item

There are no files associated with this item.
