Latent Derivative Bayesian Last Layer Networks

dc.contributor: Aalto-yliopisto (fi)
dc.contributor: Aalto University (en)
dc.contributor.author: Watson, Joe (en_US)
dc.contributor.author: Lin, Jihao Andreas (en_US)
dc.contributor.author: Klink, Pascal (en_US)
dc.contributor.author: Pajarinen, Joni (en_US)
dc.contributor.author: Peters, Jan (en_US)
dc.contributor.department: Department of Electrical Engineering and Automation (en)
dc.contributor.editor: Banerjee, A (en_US)
dc.contributor.editor: Fukumizu, K (en_US)
dc.contributor.groupauthor: Robot Learning (en)
dc.contributor.organization: Technische Universität Darmstadt (en_US)
dc.date.accessioned: 2021-09-29T09:59:22Z
dc.date.available: 2021-09-29T09:59:22Z
dc.date.issued: 2021 (en_US)
dc.description.abstract: Bayesian neural networks (BNN) are powerful parametric models for nonlinear regression with uncertainty quantification. However, the approximate inference techniques for weight space priors suffer from several drawbacks. The 'Bayesian last layer' (BLL) is an alternative BNN approach that learns the feature space for an exact Bayesian linear model with explicit predictive distributions. However, its predictions outside of the data distribution (OOD) are typically overconfident, as the marginal likelihood objective results in a learned feature space that overfits to the data. We overcome this weakness by introducing a functional prior on the model's derivatives w.r.t. the inputs. Treating these Jacobians as latent variables, we incorporate the prior into the objective to influence the smoothness and diversity of the features, which enables greater predictive uncertainty. For the BLL, the Jacobians can be computed directly using forward mode automatic differentiation, and the distribution over Jacobians may be obtained in closed-form. We demonstrate this method enhances the BLL to Gaussian process-like performance on tasks where calibrated uncertainty is critical: OOD regression, Bayesian optimization and active learning, which include high-dimensional real-world datasets. (en)
dc.description.version: Peer reviewed (en)
dc.format.extent: 13
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Watson, J, Lin, J A, Klink, P, Pajarinen, J & Peters, J 2021, 'Latent Derivative Bayesian Last Layer Networks', in A Banerjee & K Fukumizu (eds), 24th International Conference on Artificial Intelligence and Statistics (AISTATS), Proceedings of Machine Learning Research, vol. 130, JMLR, pp. 1198-1206, International Conference on Artificial Intelligence and Statistics, Virtual, Online, 13/04/2021. <http://proceedings.mlr.press/v130/watson21a/watson21a.pdf> (en)
dc.identifier.issn: 2640-3498
dc.identifier.other: PURE UUID: 96d12bfc-6b0e-41cc-8072-ea502937371d (en_US)
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/96d12bfc-6b0e-41cc-8072-ea502937371d (en_US)
dc.identifier.other: PURE LINK: http://proceedings.mlr.press/v130/watson21a/watson21a.pdf (en_US)
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/67645279/watson21a_Copy.pdf (en_US)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/110177
dc.identifier.urn: URN:NBN:fi:aalto-202109299377
dc.language.iso: en (en)
dc.relation.ispartof: International Conference on Artificial Intelligence and Statistics (en)
dc.relation.ispartofseries: 24th International Conference on Artificial Intelligence and Statistics (AISTATS) (en)
dc.relation.ispartofseries: pp. 1198-1206 (en)
dc.relation.ispartofseries: Proceedings of Machine Learning Research; Volume 130 (en)
dc.rights: openAccess (en)
dc.title: Latent Derivative Bayesian Last Layer Networks (en)
dc.type: A4 Artikkeli konferenssijulkaisussa (fi; "article in a conference publication")
dc.type.version: publishedVersion
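The abstract notes two properties that make the BLL tractable: the last layer is an exact Bayesian linear model with a closed-form predictive distribution, and, because the output is linear in the last-layer weights, the distribution over the model's input derivatives is also available in closed form. The sketch below illustrates both facts in plain numpy under simplifying assumptions: a tiny fixed tanh feature map stands in for the learned network, the feature Jacobian is written out by the chain rule rather than via forward-mode automatic differentiation as in the paper, and all names (`features`, `predict`, `derivative_distribution`, `alpha`, `beta`) are hypothetical, not from the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny fixed feature network standing in for the learned one: phi(x) = tanh(W1 x + b1)
d_in, d_feat = 2, 8
W1 = rng.normal(size=(d_feat, d_in))
b1 = rng.normal(size=d_feat)

def features(x):
    """Feature map phi(x) of a one-layer tanh network."""
    return np.tanh(W1 @ x + b1)

def features_jacobian(x):
    """d phi / d x by the chain rule (the paper uses forward-mode autodiff instead)."""
    h = W1 @ x + b1
    return (1.0 - np.tanh(h) ** 2)[:, None] * W1          # shape (d_feat, d_in)

# Exact Bayesian linear model on the features: y = w^T phi(x) + noise
X = rng.normal(size=(20, d_in))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)
Phi = np.stack([features(x) for x in X])                   # (N, d_feat)
alpha, beta = 1.0, 100.0                                   # prior precision, noise precision (assumed values)
S = np.linalg.inv(alpha * np.eye(d_feat) + beta * Phi.T @ Phi)  # posterior covariance of w
m = beta * S @ Phi.T @ y                                   # posterior mean of w

def predict(x):
    """Closed-form predictive mean and variance of the BLL at input x."""
    phi = features(x)
    return m @ phi, phi @ S @ phi + 1.0 / beta

def derivative_distribution(x):
    """df/dx = J^T w is linear in w, so its posterior is Gaussian in closed form."""
    J = features_jacobian(x)
    return J.T @ m, J.T @ S @ J                            # mean (d_in,), covariance (d_in, d_in)
```

The last function is the quantity the paper places a functional prior on: because the Gaussian posterior over `w` pushes forward linearly through the Jacobian, the derivative distribution needs no sampling or approximation.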
