A decision-theoretic approach for model interpretability in Bayesian framework

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Afrabandpey, Homayun [en_US]
dc.contributor.author: Peltola, Tomi [en_US]
dc.contributor.author: Piironen, Juho [en_US]
dc.contributor.author: Vehtari, Aki [en_US]
dc.contributor.author: Kaski, Samuel [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Centre of Excellence in Computational Inference, COIN [en]
dc.contributor.groupauthor: Probabilistic Machine Learning [en]
dc.contributor.groupauthor: Helsinki Institute for Information Technology (HIIT) [en]
dc.contributor.groupauthor: Professorship Vehtari Aki [en]
dc.contributor.groupauthor: Finnish Center for Artificial Intelligence, FCAI [en]
dc.contributor.groupauthor: Professorship Kaski Samuel [en]
dc.date.accessioned: 2020-10-02T06:25:14Z
dc.date.available: 2020-10-02T06:25:14Z
dc.date.issued: 2020-09-01 [en_US]
dc.description.abstract: A salient approach to interpretable machine learning is to restrict modeling to simple models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models. Fundamentally, however, interpretability is about users’ preferences, not the data generation mechanism; it is more natural to formulate interpretability as a utility function. In this work, we propose an interpretability utility, which explicates the trade-off between explanation fidelity and interpretability in the Bayesian framework. The method consists of two steps. First, a reference model, possibly a black-box Bayesian predictive model which does not compromise accuracy, is fitted to the training data. Second, a proxy model from an interpretable model family that best mimics the predictive behaviour of the reference model is found by optimizing the interpretability utility function. The approach is model agnostic (neither the interpretable model nor the reference model is restricted to a certain class of models), and the optimization problem can be solved using standard tools. Through experiments on real-world data sets, using decision trees as interpretable models and Bayesian additive regression models as reference models, we show that for the same level of interpretability, our approach generates more accurate models than the alternative of restricting the prior. We also propose a systematic way to measure the stability of interpretable models constructed by different interpretability approaches and show that our proposed approach generates more stable models. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 22
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Afrabandpey, H, Peltola, T, Piironen, J, Vehtari, A & Kaski, S 2020, 'A decision-theoretic approach for model interpretability in Bayesian framework', Machine Learning, vol. 109, no. 9-10, pp. 1855-1876. https://doi.org/10.1007/s10994-020-05901-8 [en]
dc.identifier.doi: 10.1007/s10994-020-05901-8 [en_US]
dc.identifier.issn: 0885-6125
dc.identifier.issn: 1573-0565
dc.identifier.other: PURE UUID: d7104696-1a54-44be-8e86-45503100f175 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/d7104696-1a54-44be-8e86-45503100f175 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85090308542&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/51833858/Afrabandpey2020_Article_ADecision_theoreticApproachFor.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/46824
dc.identifier.urn: URN:NBN:fi:aalto-202010025789
dc.language.iso: en [en]
dc.publisher: Springer
dc.relation.ispartofseries: Machine Learning [en]
dc.relation.ispartofseries: Volume 109, issue 9-10, pp. 1855-1876 [en]
dc.rights: openAccess [en]
dc.subject.keyword: Bayesian predictive models [en_US]
dc.subject.keyword: Interpretable machine learning [en_US]
dc.title: A decision-theoretic approach for model interpretability in Bayesian framework [en]
dc.type: A1 Original article in a scientific journal [fi]
dc.type.version: publishedVersion
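
The two-step method summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes scikit-learn is available, uses a gradient-boosting ensemble as a stand-in for the Bayesian reference model (the paper uses Bayesian additive regression trees), takes depth-limited decision trees as the interpretable family, and approximates the interpretability utility as fidelity to the reference predictions minus a hypothetical depth penalty `lam`.

    # Minimal sketch of the two-step recipe; all modeling choices
    # here are stand-ins, not the paper's actual implementation.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

    # Step 1: fit a flexible (possibly black-box) reference model
    # to the training data without compromising accuracy.
    reference = GradientBoostingRegressor(random_state=0).fit(X, y)
    y_ref = reference.predict(X)

    # Step 2: search the interpretable family (depth-limited trees)
    # for the proxy that best mimics the reference's predictions,
    # trading fidelity against an interpretability cost.
    lam = 5.0  # hypothetical trade-off weight between fidelity and simplicity
    best_depth, best_utility = None, -np.inf
    for depth in range(1, 8):
        proxy = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X, y_ref)
        fidelity = -np.mean((proxy.predict(X) - y_ref) ** 2)  # negative MSE to reference
        utility = fidelity - lam * depth
        if utility > best_utility:
            best_depth, best_utility = depth, utility

    print(f"selected proxy tree depth: {best_depth}")

In the paper the reference model is Bayesian, so fidelity would be measured against its predictive distribution rather than point predictions; the scalar `lam` above is only a stand-in for the trade-off that the proposed utility makes explicit.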
