Interpreting Multivariate Time Series for an Organization Health Platform
Perustieteiden korkeakoulu | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Master's Programme in ICT Innovation
70 + 1
AbstractMachine learning-based systems are rapidly becoming popular because it has been realized that machines are more efficient and effective than humans at performing certain tasks. Although machine learning algorithms are extremely popular, they are also very literal and undeviating. This has led to a huge research surge in the field of interpretability in machine learning to ensure that machine learning models are reliable, fair, and can be held liable for their decision-making process. Moreover, in most real-world problems just making predictions using machine learning algorithms only solves the problem partially. Time series is one of the most popular and important data types because of its dominant presence in the fields of business, economics, and engineering. Despite this, interpretability in time series is still relatively unexplored as compared to tabular, text, and image data. With the growing research in the field of interpretability in machine learning, there is also a pressing need to be able to quantify the quality of explanations produced after interpreting machine learning models. Due to this reason, evaluation of interpretability is extremely important. The evaluation of interpretability for models built on time series seems completely unexplored in research circles. This thesis work focused on achieving and evaluating model agnostic interpretability in a time series forecasting problem. The use case discussed in this thesis work focused on finding a solution to a problem faced by a digital consultancy company. The digital consultancy wants to take a data-driven approach to understand the effect of various sales related activities in the company on the sales deals closed by the company. The solution involved framing the problem as a time series forecasting problem to predict the sales deals and interpreting the underlying forecasting model. The interpretability was achieved using two novel model agnostic interpretability techniques, Local interpretable model- agnostic explanations (LIME) and Shapley additive explanations (SHAP). The explanations produced after achieving interpretability were evaluated using human evaluation of interpretability. The results of the human evaluation studies clearly indicate that the explanations produced by LIME and SHAP greatly helped lay humans in understanding the predictions made by the machine learning model. The human evaluation study results also indicated that LIME and SHAP explanations were almost equally understandable with LIME performing better but with a very small margin. The work done during this project can easily be extended to any time series forecasting or classification scenario for achieving and evaluating interpretability. Furthermore, this work can offer a very good framework for achieving and evaluating interpretability in any machine learning-based regression or classification problem.
Thesis advisorCavdar, Cicek
interpretability, forecasting, LIME, SHAP, time series, explainable artificial intelligence