Methods for probabilistic modeling of knowledge elicitation for improving machine learning predictions
School of Science | Doctoral thesis (article-based) | Defence date: 2020-12-18
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Authors
Date
2020
Language
en
Pages
64 + app. 66
Series
Aalto University publication series DOCTORAL DISSERTATIONS, 210/2020
Abstract
Many applications of supervised machine learning involve training data with a large number of features and a small sample size. Constructing models with reliable predictive performance in such applications is challenging. To alleviate these challenges, either more samples are required, which can be difficult or even impossible to obtain in some applications, or additional sources of information are required to regularize the models. One such source is the domain expert. However, extracting knowledge from a human expert can itself be difficult; it requires computer systems that experts can interact with effectively and effortlessly. This thesis proposes novel knowledge elicitation approaches to improve the predictive performance of statistical models. The first contribution of this thesis is to develop methods that incorporate different types of knowledge about features, extracted from a domain expert, into the construction of the machine learning model. Several solutions are proposed for knowledge elicitation, including interactive visualization of the effect of feedback on features, and active learning. Experiments demonstrate that the proposed methods improve the predictive performance of an underlying model with only limited user interaction. The second contribution of the thesis is a new approach to the interpretability of Bayesian predictive models, intended to facilitate the interaction of human users with Bayesian black-box predictive models. The proposed approach separates model specification from model interpretation via a two-stage decision-theoretic approach: first construct a highly predictive model without compromising accuracy, and then optimize the interpretability. Experiments demonstrate that the proposed method constructs models that are more accurate, and yet more interpretable, than the alternative practice of incorporating interpretability constraints into the model specification via the prior distribution.
Description
Supervising professor
Kaski, Samuel, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Peltola, Tomi, Dr., Aalto University, Department of Computer Science, Finland
Keywords
machine learning, knowledge elicitation
Parts
- [Publication 1]: Seppo Virtanen, Homayun Afrabandpey, and Samuel Kaski. Visualizations relevant to the user by multi-view latent variable factorization. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2464–2468, IEEE, 2016.
  DOI: 10.1109/ICASSP.2016.7472120
- [Publication 2]: Homayun Afrabandpey, Tomi Peltola, and Samuel Kaski. Interactive prior elicitation of feature similarities for small sample size prediction. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization (UMAP), pp. 265–269, ACM, 2017.
  DOI: 10.1145/3079628.3079698
- [Publication 3]: Iiris Sundin, Tomi Peltola, Luana Micallef, Homayun Afrabandpey, Marta Soare, Muntasir Mamun Majumder, Pedram Daee, Chen He, Baris Serim, Aki Havulinna, Caroline Heckman, Giulio Jacucci, Pekka Marttinen, and Samuel Kaski. Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge. Bioinformatics, 34, 13, pp. i395–i403, 2018.
  Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-201808014342
  DOI: 10.1093/bioinformatics/bty257
- [Publication 4]: Homayun Afrabandpey, Tomi Peltola, and Samuel Kaski. Human-in-the-loop active covariance learning for improving prediction in small data sets. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), pp. 1959–1966, AAAI Press, 2019.
  DOI: 10.24963/ijcai.2019/271
- [Publication 5]: Homayun Afrabandpey, Tomi Peltola, Juho Piironen, Aki Vehtari, and Samuel Kaski. A decision-theoretic approach for model interpretability in Bayesian framework. Machine Learning, 109, 9, pp. 1855–1876, 2020.
  Full text in Acris/Aaltodoc: http://urn.fi/URN:NBN:fi:aalto-202010025789
  DOI: 10.1007/s10994-020-05901-8