Using reference models in variable selection

dc.contributor | Aalto-yliopisto | fi
dc.contributor | Aalto University | en
dc.contributor.author | Pavone, Federico | en_US
dc.contributor.author | Piironen, Juho | en_US
dc.contributor.author | Bürkner, Paul Christian | en_US
dc.contributor.author | Vehtari, Aki | en_US
dc.contributor.department | Department of Computer Science | en
dc.contributor.groupauthor | Professorship Vehtari Aki | en
dc.contributor.groupauthor | Probabilistic Machine Learning | en
dc.contributor.groupauthor | Helsinki Institute for Information Technology (HIIT) | en
dc.contributor.groupauthor | Computer Science Professors | en
dc.contributor.groupauthor | Computer Science - Artificial Intelligence and Machine Learning (AIML) | en
dc.contributor.organization | Department of Computer Science | en_US
dc.date.accessioned | 2023-03-15T07:10:20Z
dc.date.available | 2023-03-15T07:10:20Z
dc.date.issued | 2023-03 | en_US
dc.description | Funding Information: We thank Alejandro Catalina Feliu for help with experiments, and the Academy of Finland (Grants 298742 and 313122), the Finnish Center for Artificial Intelligence, and the Technology Industries of Finland Centennial Foundation (Grant 70007503; Artificial Intelligence for Research and Development) for partial support of this research. We also acknowledge the computational resources provided by the Aalto Science-IT project. Publisher Copyright: © 2022, The Author(s).
dc.description.abstract | Variable selection, or more generally, model reduction, is an important aspect of the statistical workflow aiming to provide insights from data. In this paper, we discuss and demonstrate the benefits of using a reference model in variable selection. A reference model acts as a noise filter on the target variable by modeling its data generating mechanism. As a result, using the reference model predictions in the model selection procedure reduces variability and improves stability, leading to improved model selection performance. Assuming that a Bayesian reference model describes the true distribution of future data well, the theoretically preferred usage of the reference model is to project its predictive distribution onto a reduced model, leading to the projection predictive variable selection approach. We analyse how much of the strong performance of projection predictive variable selection is due to the use of a reference model, and show that other variable selection methods can also be greatly improved by using the reference model as the target instead of the original data. In several numerical experiments, we investigate the performance of the projection predictive approach as well as alternative variable selection methods with and without reference models. Our results indicate that the use of reference models generally translates into better and more stable variable selection. | en
dc.description.version | Peer reviewed | en
dc.format.extent | 23
dc.format.extent | 349-371
dc.format.mimetype | application/pdf | en_US
dc.identifier.citation | Pavone, F, Piironen, J, Bürkner, P C & Vehtari, A 2023, 'Using reference models in variable selection', Computational Statistics, vol. 38, no. 1, pp. 349-371. https://doi.org/10.1007/s00180-022-01231-6 | en
dc.identifier.doi | 10.1007/s00180-022-01231-6 | en_US
dc.identifier.issn | 0943-4062
dc.identifier.issn | 1613-9658
dc.identifier.other | PURE UUID: db55206d-85d7-4fc8-b9d0-9d08a086525b | en_US
dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/db55206d-85d7-4fc8-b9d0-9d08a086525b | en_US
dc.identifier.other | PURE LINK: http://www.scopus.com/inward/record.url?scp=85130114581&partnerID=8YFLogxK | en_US
dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/102729195/SCI_Pavone_etal_Computational_Statistics_2023.pdf | en_US
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/120103
dc.identifier.urn | URN:NBN:fi:aalto-202303152429
dc.language.iso | en | en
dc.publisher | SPRINGER
dc.relation.ispartofseries | Computational Statistics | en
dc.relation.ispartofseries | Volume 38, issue 1 | en
dc.rights | openAccess | en
dc.subject.keyword | Bayesian statistics | en_US
dc.subject.keyword | Model reduction | en_US
dc.subject.keyword | Projection predictive approach | en_US
dc.title | Using reference models in variable selection | en
dc.type | A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (A1 Original article in a scientific journal) | fi
dc.type.version | publishedVersion

Files