Distributed Bayesian matrix factorization with limited communication

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Qin, Xiangju
dc.contributor.author: Blomstedt, Paul
dc.contributor.author: Leppäaho, Eemeli
dc.contributor.author: Parviainen, Pekka
dc.contributor.author: Kaski, Samuel
dc.contributor.department: Probabilistic Machine Learning
dc.contributor.department: Centre of Excellence in Computational Inference, COIN
dc.contributor.department: Department of Computer Science
dc.date.accessioned: 2019-05-06T09:19:15Z
dc.date.available: 2019-05-06T09:19:15Z
dc.date.issued: 2019-01-01
dc.description: | openaire: EC/H2020/671555/EU//ExCAPE
dc.description.abstract: Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices and for predicting missing values and providing confidence intervals. Scaling up the posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the communication needed, by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive with other available distributed and parallel implementations of BMF. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 1-26
dc.format.mimetype: application/pdf
dc.identifier.citation: Qin, X, Blomstedt, P, Leppäaho, E, Parviainen, P & Kaski, S 2019, 'Distributed Bayesian matrix factorization with limited communication', Machine Learning, pp. 1-26. https://doi.org/10.1007/s10994-019-05778-2 [en]
dc.identifier.doi: 10.1007/s10994-019-05778-2
dc.identifier.issn: 0885-6125
dc.identifier.issn: 1573-0565
dc.identifier.other: PURE UUID: 8f440f9a-370a-48fd-8433-4318b7a976ba
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/distributed-bayesian-matrix-factorization-with-limited-communication(8f440f9a-370a-48fd-8433-4318b7a976ba).html
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85064242641&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/33413162/Qin2019_Article_DistributedBayesianMatrixFacto.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/37722
dc.identifier.urn: URN:NBN:fi:aalto-201905062840
dc.language.iso: en [en]
dc.publisher: Springer Netherlands
dc.relation: info:eu-repo/grantAgreement/EC/H2020/671555/EU//ExCAPE
dc.relation.ispartofseries: Machine Learning [en]
dc.rights: openAccess [en]
dc.subject.keyword: Bayesian matrix factorization
dc.subject.keyword: Distributed inference
dc.subject.keyword: Embarrassingly parallel MCMC
dc.subject.keyword: Posterior propagation
dc.subject.keyword: Software
dc.subject.keyword: Artificial Intelligence
dc.subject.keyword: 113 Computer and information sciences
dc.subject.other: Software [en]
dc.subject.other: Artificial Intelligence [en]
dc.subject.other: 113 Computer and information sciences [en]
dc.title: Distributed Bayesian matrix factorization with limited communication [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (A1 Original article in a scientific journal) [fi]
dc.type.version: publishedVersion

Files

Original bundle

Name: Qin2019_Article_DistributedBayesianMatrixFacto.pdf
Size: 1.18 MB
Format: Adobe Portable Document Format