Performance analysis of neural likelihood approximation methods for decision making models

Perustieteiden korkeakoulu (School of Science) | Master's thesis

Date

2024-06-18

Major/Subject

Applied Mathematics

Mcode

SCI3053

Degree programme

Master’s Programme in Mathematics and Operations Research

Language

en

Pages

33+10

Abstract

Computational models of decision making are central to modern experimental sciences. This thesis draws its inspiration from cognitive neuroscience and drift-diffusion models of decision making. The decision process, or evidence accumulation, is modelled as a random walk governed by several parameters: the starting point, the decision bound, and the drift rate. Inferring these parameters for posterior estimation is usually approached with simulation-based Bayesian inference. A major issue in this field is that the associated likelihoods can rarely be computed in explicit analytical form. This problem of likelihood intractability is addressed with density approximation methods. In this thesis we analyze established claims about kernel density estimation (KDE) and attempt to reproduce these classic results at modern dataset sizes. Furthermore, recent literature proposes several machine learning methods for density approximation: likelihood approximation networks (LANs) and mixed neural likelihood estimators (MNLEs). These networks take very different approaches to simulation-based Bayesian inference: LANs are built on KDE and trained as multi-layer perceptrons, whereas MNLEs are modern density estimation networks based on normalizing flows. We analyze the performance and computational costs of these two neural network models. We start with an analysis of the existing literature on KDE methods and show experimentally that, with modern hardware, the main issues of non-parametric kernel density estimation, such as kernel choice and bandwidth selection, become much less relevant as datasets grow. Since one of the network architectures is trained on KDE-generated datasets, our results also suggest that this reduced sensitivity to kernel choice and bandwidth carries over to the predictions of the LAN. Comparing the performance of the two neural networks, we demonstrate that a simple multi-layer perceptron architecture trained on KDE data quickly produces results very similar to the training-costly learning of a normalizing flow's transformations.
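As a minimal sketch of the simulate-then-KDE pipeline the abstract describes (not code from the thesis: the function simulate_ddm, its parameter values, and the symmetric-bound simplification are all assumptions made here for illustration), the first stage might look as follows in Python.

import numpy as np
from scipy.stats import gaussian_kde

def simulate_ddm(drift, bound, start=0.0, dt=1e-3, noise=1.0, max_t=10.0, rng=None):
    # One drift-diffusion trial as a discretized random walk between
    # symmetric bounds at +bound and -bound (a simplified variant).
    rng = np.random.default_rng() if rng is None else rng
    x, t = start, 0.0
    while abs(x) < bound and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return t, 1 if x >= bound else -1  # (reaction time, choice)

rng = np.random.default_rng(0)
trials = [simulate_ddm(drift=0.5, bound=1.0, rng=rng) for _ in range(5000)]
rts_upper = np.array([rt for rt, c in trials if c == 1])

# Non-parametric likelihood of upper-bound reaction times via Gaussian KDE;
# scipy selects the bandwidth by Scott's rule unless told otherwise.
kde = gaussian_kde(rts_upper)
print(kde.evaluate([0.5, 1.0, 2.0]))  # approximate densities at three RTs

In the methods compared in the thesis, a KDE step like the one above either supplies training targets for a multi-layer perceptron (the LAN approach) or is replaced entirely by a normalizing flow trained on raw simulations (the MNLE approach).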

Supervisor

Ilmonen, Pauliina

Thesis advisor

Fengler, Alexander

Keywords

density estimation, likelihood approximation, neural likelihood, normalizing flow, kernel density estimation, Bayesian inference
