Interactive Intent Modeling Based on Probabilistic Sparse Models

Loading...
Thumbnail Image

URL

Journal Title

Journal ISSN

Volume Title

Perustieteiden korkeakoulu | Master's thesis

Date

2017-04-03

Department

Major/Subject

Machine Learning and Data Mining

Mcode

SCI3044

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

56+0

Series

Abstract

In the exploratory search system, the user interacts with the system by providing feedback on the relevance of the recommended documents and keywords. It is often that the user is unfamiliar with the topic she is investigating, so the system should be able to help her form a precise query and explore the information space. Typically, the exploratory search process is modeled as a contextual bandit problem, a sequential learning algorithm which adopts the recommendation strategy based on user's feedback, aiming at suggesting more precise keywords and retrieving the most relevant documents with minimum user interactions. One big challenge in the exploratory search is that the corpus in which a bandit algorithm explores is huge while the feedback from the user is always scarce, leading to a non-trivial learning problem with large dimensionality and limited observations. In this thesis, I tackle this challenge by adopting Bayesian linear regression with spike and slab priors which enforce sparsity on the feature space, so the bandit algorithm could narrow down the search to the most relevant documents. I incorporate the Expectation Propagation algorithm to approximate the posterior distribution of the sparse model, Thompson sampling to address the exploration-exploitation dilemma, and Topic model to discover the structure of the documents which could provide group information that can further constrain the search space to specific topics. To assess the models, I simulate the user behavior in an exploratory search process and compare the model coefficients learned by linear models using Gaussian prior and, spike-and-slab prior with or without group information. Several performance metrics are also evaluated. Empirically, the spike-and-slab with or without group information perform similarly and outperform Gaussian prior which does not encourage sparsity. The learned model coefficients justify the assumption that most of the coefficients do not contribute to the model. Besides, the model of group spike-and-slab prior has fewer coefficients needed to be estimated than spike-and-slab prior without group information, and potentially could be applied to larger corpora.

Description

Supervisor

Kaski, Samuel

Thesis advisor

Daee, Pedram

Keywords

exploratory search, Bayesian sparse linear model, multi-armed bandit, Thompson sampling, spike and slab priors

Other note

Citation