Methods for probabilistic modeling of knowledge elicitation for improving machine learning predictions

School of Science | Doctoral thesis (article-based) | Defence date: 2020-12-18
Pages: 64 + app. 66
Aalto University publication series DOCTORAL DISSERTATIONS, 210/2020
Many applications of supervised machine learning involve training data with a large number of features but a small sample size. Constructing models with reliable predictive performance in such settings is challenging. To alleviate this, either more samples are needed, which can be very difficult or even impossible to obtain, or additional sources of information are required to regularize the models. One such source is the domain expert; however, extracting knowledge from a human expert is itself difficult and requires computer systems with which experts can interact effectively and effortlessly. This thesis proposes novel knowledge elicitation approaches to improve the predictive performance of statistical models.

The first contribution of this thesis is a set of methods that incorporate different types of expert knowledge about features into the construction of machine learning models. Several solutions are proposed for knowledge elicitation, including interactive visualization of the effect of feedback on features, and active learning. Experiments demonstrate that the proposed methods improve the predictive performance of the underlying model through limited interaction with the user.

The second contribution is a new approach to the interpretability of Bayesian predictive models, which facilitates human interaction with black-box predictive models. The proposed approach separates model specification from model interpretation via a two-stage decision-theoretic procedure: first construct a highly predictive model without compromising accuracy, then optimize its interpretability. Experiments demonstrate that the proposed method constructs models that are more accurate, and yet more interpretable, than the alternative practice of incorporating interpretability constraints into the model specification through the prior distribution.
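The idea of regularizing a "many features, few samples" model with expert knowledge on features can be illustrated with a minimal sketch. This is not the thesis's actual elicitation machinery; it only shows the underlying principle, assuming (hypothetically) that an expert has flagged which features are relevant and that this feedback is encoded as per-feature prior variances in a Bayesian linear model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "large p, small n" regression task: 3 informative features out of 20.
n, p = 15, 20
true_w = np.zeros(p)
true_w[:3] = [2.0, -1.5, 1.0]
X = rng.normal(size=(n, p))
y = X @ true_w + 0.5 * rng.normal(size=n)

def ridge_map(X, y, tau2, sigma2=0.25):
    """MAP estimate of a Bayesian linear model with per-feature Gaussian
    priors w_j ~ N(0, tau2[j]) and Gaussian noise variance sigma2."""
    A = X.T @ X / sigma2 + np.diag(1.0 / tau2)
    return np.linalg.solve(A, X.T @ y / sigma2)

# Without expert knowledge: one shared, vague prior scale for all features.
w_flat = ridge_map(X, y, tau2=np.full(p, 1.0))

# With (simulated) expert feedback: features flagged as relevant get a wide
# prior; the rest get a narrow prior that shrinks them toward zero.
relevant = np.zeros(p, dtype=bool)
relevant[:3] = True
tau2 = np.where(relevant, 10.0, 1e-2)
w_expert = ridge_map(X, y, tau2=tau2)

err_flat = np.linalg.norm(w_flat - true_w)
err_expert = np.linalg.norm(w_expert - true_w)
print(f"error without feedback: {err_flat:.3f}, with feedback: {err_expert:.3f}")
```

With fewer samples than features, the flat prior leaves the problem badly underdetermined, while the feedback-informed prior concentrates the model on the expert-endorsed features; the thesis's methods elicit such feedback interactively and actively rather than assuming it is given.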
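The two-stage decision-theoretic idea (fit an accurate model first, then optimize interpretability against it) can be sketched in a distillation-style toy example. The reference model (RBF kernel ridge) and the proxy (a one-split stump) are illustrative choices, not the specific models used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-D toy data: a step-like signal observed with noise.
X = np.sort(rng.uniform(-3, 3, size=60))
y = np.where(X > 0.5, 2.0, -1.0) + 0.3 * rng.normal(size=60)

# Stage 1: fit an accurate but opaque reference model (RBF kernel ridge).
def rbf(a, b, ls=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

K = rbf(X, X)
alpha = np.linalg.solve(K + 0.1 * np.eye(len(X)), y)
f_ref = K @ alpha  # reference predictions at the training inputs

# Stage 2: choose the interpretable proxy (a single-split stump) that best
# matches the reference model's predictions -- not the noisy labels.
def best_stump(x, target):
    best_loss, best_params = np.inf, None
    for t in x[1:]:
        left, right = target[x < t], target[x >= t]
        loss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if loss < best_loss:
            best_loss, best_params = loss, (t, left.mean(), right.mean())
    return best_params

threshold, left_val, right_val = best_stump(X, f_ref)
print(f"if x < {threshold:.2f}: predict {left_val:.2f}, else {right_val:.2f}")
```

The key point of the separation is that the interpretable model is selected to approximate the reference model's predictive distribution, so interpretability constraints never enter the specification (e.g., the prior) of the predictive model itself.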
Supervising professor
Kaski, Samuel, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Peltola, Tomi, Dr., Aalto University, Department of Computer Science, Finland
Keywords: machine learning, knowledge elicitation
  • [Publication 1]: Seppo Virtanen, Homayun Afrabandpey, and Samuel Kaski. Visualizations relevant to the user by multi-view latent variable factorization. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2464–2468, IEEE, 2016.
    DOI: 10.1109/ICASSP.2016.7472120
  • [Publication 2]: Homayun Afrabandpey, Tomi Peltola, and Samuel Kaski. Interactive prior elicitation of feature similarities for small sample size prediction. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization (UMAP), pp. 265–269, ACM, 2017.
    DOI: 10.1145/3079628.3079698
  • [Publication 3]: Iiris Sundin, Tomi Peltola, Luana Micallef, Homayun Afrabandpey, Marta Soare, Muntasir Mamun Majumder, Pedram Daee, Chen He, Baris Serim, Aki Havulinna, Caroline Heckman, Giulio Jacucci, Pekka Marttinen, and Samuel Kaski. Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge. Bioinformatics, 34, 13, pp. i395–i403, 2018.
    DOI: 10.1093/bioinformatics/bty257
  • [Publication 4]: Homayun Afrabandpey, Tomi Peltola, and Samuel Kaski. Human-in-the-loop active covariance learning for improving prediction in small data sets. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), pp. 1959–1966, AAAI Press, 2019.
    DOI: 10.24963/ijcai.2019/271
  • [Publication 5]: Homayun Afrabandpey, Tomi Peltola, Juho Piironen, Aki Vehtari, and Samuel Kaski. A decision-theoretic approach for model interpretability in Bayesian framework. Machine Learning, 109, 9, pp. 1855–1876, 2020.
    DOI: 10.1007/s10994-020-05901-8