Theory of Mind Based Models in Human-AI Interaction

School of Science (Perustieteiden korkeakoulu) | Master's thesis

Date

2018-12-10

Department

Major/Subject

Machine Learning, Data Science and Artificial Intelligence

Mcode

SCI3044

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

49

Abstract

Humans are social animals. They have goals, they make plans, they collaborate and compete. The richness of human-human interaction is immense. Yet the way modern AI systems model their interaction with human users does not take these aspects into account. Often, human feedback is modelled as samples from an unknown but fixed probability distribution. Such models cannot capture the active planning of real humans. The underlying motivation of this thesis is that the performance of human-AI collaboration is limited by the parties' ability to model each other's minds. In human-human interaction, this ability is called theory of mind, and cognitive science studies have shown it to be a limiting factor in human teams' task performance. To examine the effects of theory of mind based user models, we define a multi-armed bandit setting in which the system takes into account that the user is able to anticipate the system's behaviour multiple steps ahead and strategically plan her feedback. We compare the performance of the proposed setting to the standard multi-armed bandit setting, where the feedback is assumed to be samples from an unknown probability distribution. Empirical results demonstrate that better reward performance and ranking of arms are achieved when users can behave strategically and the system takes this into account. The results indicate that the performance of human-AI teams increases with how well the parties can model each other and use their models to plan their interaction.
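
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal illustrative sketch of a standard multi-armed bandit in which feedback is drawn from a fixed, unknown distribution. It is not taken from the thesis: the choice of Bernoulli rewards, Thompson sampling with a Beta(1, 1) prior, and the names `run_bernoulli_bandit`, `true_means`, and `horizon` are all assumptions made for illustration only.

```python
import random


def run_bernoulli_bandit(true_means, horizon, seed=0):
    """Thompson sampling on a Bernoulli bandit with passive feedback.

    Illustrative assumption, not the thesis's method: the "user" here
    never plans; each reward is an independent draw from a fixed
    distribution tied to the chosen arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Beta(1, 1) prior on each arm's unknown success probability.
    successes = [1] * n_arms
    failures = [1] * n_arms
    total_reward = 0
    for _ in range(horizon):
        # Draw one plausible mean per arm from its posterior, play the best.
        samples = [rng.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Non-strategic feedback: a Bernoulli draw with a fixed mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    # Posterior mean per arm, used to rank the arms from best to worst.
    estimates = [successes[a] / (successes[a] + failures[a])
                 for a in range(n_arms)]
    ranking = sorted(range(n_arms), key=lambda a: estimates[a], reverse=True)
    return total_reward, estimates, ranking
```

Calling, for example, `run_bernoulli_bandit([0.2, 0.5, 0.8], horizon=2000)` returns the cumulative reward, the per-arm posterior means, and the resulting ranking of arms, which correspond to the two quantities (reward performance and arm ranking) the abstract reports. The setting proposed in the thesis differs in that the user's feedback is planned strategically, which this sketch does not model.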

Description

Supervisor

Kaski, Samuel

Thesis advisor

Peltola, Tomi

Keywords

Bayesian modelling, human-AI collaboration, interactive systems, inverse reinforcement learning, multi-armed bandits, theory of mind
