Insightful dimensionality reduction with very low rank variable subsets

dc.contributor | Aalto-yliopisto | fi
dc.contributor | Aalto University | en
dc.contributor.author | Ordozgoiti, Bruno | en_US
dc.contributor.author | Pai, Sachith | en_US
dc.contributor.author | Kolczynska, Marta | en_US
dc.contributor.department | Department of Computer Science | en_US
dc.contributor.department | University of Helsinki | en_US
dc.contributor.department | Polish Academy of Sciences | en_US
dc.date.accessioned | 2022-01-13T09:41:54Z
dc.date.available | 2022-01-13T09:41:54Z
dc.date.issued | 2021-06-03 | en_US
dc.description | Funding Information: This work was supported by the Academy of Finland project AIDA (317085), the EC H2020 RIA project “SoBigData++” (871042), and the Polish National Agency for Academic Exchange within the Bekker programme, number PPN/BEK/2019/1/00133. Publisher Copyright: © 2021 ACM. | openaire: EC/H2020/871042/EU//SoBigData-PlusPlus
dc.description.abstract | Dimensionality reduction techniques can be employed to produce robust, cost-effective predictive models, and to enhance interpretability in exploratory data analysis. However, the models produced by many of these methods are formulated in terms of abstract factors or are too high-dimensional to facilitate insight and fit within low computational budgets. In this paper we explore an alternative approach to interpretable dimensionality reduction. Given a data matrix, we study the following question: are there subsets of variables that can be primarily explained by a single factor? We formulate this challenge as the problem of finding submatrices close to rank one. Despite its potential, this topic has not been sufficiently addressed in the literature, and there exist virtually no algorithms for this purpose that are simultaneously effective, efficient and scalable. We formalize the task as two problems which we characterize in terms of computational complexity, and propose efficient, scalable algorithms with approximation guarantees. Our experiments demonstrate how our approach can produce insightful findings in data, and show our algorithms to be superior to strong baselines. | en
dc.description.version | Peer reviewed | en
dc.format.extent | 10
dc.format.extent | 3066-3075
dc.format.mimetype | application/pdf | en_US
dc.identifier.citation | Ordozgoiti, B, Pai, S & Kolczynska, M 2021, Insightful dimensionality reduction with very low rank variable subsets. in Proceedings of the Web Conference, WWW 2021. ACM, pp. 3066-3075, The Web Conference, Ljubljana, Slovenia, 19/04/2021. https://doi.org/10.1145/3442381.3450067 | en
dc.identifier.doi | 10.1145/3442381.3450067 | en_US
dc.identifier.isbn | 9781450383127
dc.identifier.other | PURE UUID: f8927396-4cf0-4c3e-a369-5171f0dc2af6 | en_US
dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/f8927396-4cf0-4c3e-a369-5171f0dc2af6 | en_US
dc.identifier.other | PURE LINK: http://www.scopus.com/inward/record.url?scp=85107975359&partnerID=8YFLogxK | en_US
dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/78104493/SCI_Ordozgoiti_etal_Insightful_Dimensionality_WWW_2021.pdf | en_US
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/112289
dc.identifier.urn | URN:NBN:fi:aalto-202201131196
dc.language.iso | en | en
dc.relation | info:eu-repo/grantAgreement/EC/H2020/871042/EU//SoBigData-PlusPlus | en_US
dc.relation.ispartof | The Web Conference | en
dc.relation.ispartofseries | Proceedings of the Web Conference, WWW 2021 | en
dc.rights | openAccess | en
dc.subject.keyword | Data mining | en_US
dc.subject.keyword | Dimensionality reduction | en_US
dc.subject.keyword | Explainability | en_US
dc.subject.keyword | Variable selection | en_US
dc.title | Insightful dimensionality reduction with very low rank variable subsets | en
dc.type | Conference article in proceedings | fi
dc.type.version | publishedVersion