Performance analysis of neural likelihood approximation methods for decision making models
Perustieteiden korkeakoulu (School of Science) | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Authors
Date
2024-06-18
Department
Major/Subject
Applied Mathematics
Mcode
SCI3053
Degree programme
Master’s Programme in Mathematics and Operations Research
Language
en
Pages
33+10
Series
Abstract
Computational models of decision making are central to modern experimental sciences. This thesis draws inspiration from cognitive neuroscience and drift-diffusion models of decision making. The decision process, or evidence accumulation, is modelled as a random walk governed by a few parameters: the starting point, the decision bound, and the rate of evidence accumulation (drift rate). Inferring posteriors over these parameters is a task usually approached with simulation-based Bayesian inference. A major issue in this field is that the associated likelihood usually cannot be computed in closed analytical form. This problem of likelihood intractability is addressed with density approximation methods. In this thesis we analyze established claims about kernel density estimation (KDE) and attempt to reproduce these classic results at modern dataset sizes. Further, recent literature proposes several machine learning methods for density approximation: likelihood approximation networks (LANs) and mixed neural likelihood estimators (MNLEs). These networks take quite different approaches to simulation-based Bayesian inference: LANs are based on KDE and are trained as multi-layer perceptrons, whereas MNLEs are modern density estimation networks based on normalizing flows. We analyze the performance and computational costs of these two neural network models. We start with an analysis of the existing literature on KDE methods and show experimentally that, with modern hardware and larger datasets, the main issues of non-parametric kernel density estimation, such as kernel choice and bandwidth selection, become much less relevant. As one of the network architectures (the LAN) is trained on KDE-generated datasets, our results also suggest that this reduced relevance of kernel choice and bandwidth carries over to the predictions of the LAN. Comparing the performance of the two neural networks, we demonstrate that a simple multi-layer perceptron trained on KDE data quickly produces results very similar to those obtained from the far more training-costly learning of a normalizing flow's transformations.
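As a rough illustration of the pipeline summarized in the abstract (not code from the thesis), the Python sketch below simulates the drift-diffusion process as a discretized two-boundary random walk and then approximates the response-time density non-parametrically with a Gaussian KDE. All names and parameter values (simulate_ddm, drift, bound, start, dt, sigma) are illustrative assumptions, and scipy's gaussian_kde stands in for whichever KDE implementation the thesis actually uses.

    # Minimal sketch, assuming a symmetric two-boundary drift-diffusion model.
    import numpy as np
    from scipy.stats import gaussian_kde

    def simulate_ddm(drift, bound, start, n_trials=2_000, dt=1e-3, sigma=1.0, rng=None):
        """Return (response times, choices) from a discretized random walk."""
        rng = np.random.default_rng() if rng is None else rng
        rts, choices = [], []
        for _ in range(n_trials):
            x, t = start, 0.0
            while -bound < x < bound:
                # Euler step: deterministic drift plus Gaussian diffusion noise.
                x += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
                t += dt
            rts.append(t)
            choices.append(1 if x >= bound else 0)
        return np.array(rts), np.array(choices)

    rts, choices = simulate_ddm(drift=1.0, bound=1.5, start=0.0)

    # Non-parametric approximation of the response-time density.
    # Compare Scott's rule against an arbitrary fixed bandwidth factor.
    kde_scott = gaussian_kde(rts, bw_method="scott")
    kde_fixed = gaussian_kde(rts, bw_method=0.1)
    grid = np.linspace(rts.min(), rts.max(), 200)
    print(np.abs(kde_scott(grid) - kde_fixed(grid)).max())

With many simulated trials, the two density estimates in this toy comparison tend to lie close to each other, which is in the spirit of the abstract's claim that bandwidth selection matters less for larger datasets; the actual experiments and dataset sizes are those reported in the thesis itself.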
Supervisor
Ilmonen, Pauliina
Thesis advisor
Fengler, Alexander
Keywords
density estimation, likelihood approximation, neural likelihood, normalizing flow, kernel density estimation, Bayesian inference