Interpreting Multivariate Time Series for an Organization Health Platform

dc.contributor: Aalto University [en]
dc.contributor.advisor: Cavdar, Cicek
dc.contributor.advisor: Malhi, Avleen
dc.contributor.author: Saluja, Rohit
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.supervisor: Västberg, Anders
dc.description.abstract: Machine learning-based systems are rapidly becoming popular because machines can perform certain tasks more efficiently and effectively than humans. Although machine learning algorithms are extremely popular, they are also very literal and undeviating. This has led to a surge of research on interpretability in machine learning, which aims to ensure that machine learning models are reliable and fair and can be held accountable for their decision-making process. Moreover, in most real-world problems, making predictions with machine learning algorithms solves the problem only partially. Time series is one of the most popular and important data types because of its dominant presence in business, economics, and engineering. Despite this, interpretability for time series remains relatively unexplored compared to tabular, text, and image data. As research on interpretability grows, there is also a pressing need to quantify the quality of the explanations produced when interpreting machine learning models. For this reason, the evaluation of interpretability is extremely important. The evaluation of interpretability for models built on time series appears to be largely unexplored in the research literature. This thesis focused on achieving and evaluating model-agnostic interpretability in a time series forecasting problem. The use case addresses a problem faced by a digital consultancy company, which wants to take a data-driven approach to understanding how its various sales-related activities affect the sales deals it closes. The solution involved framing the problem as a time series forecasting problem to predict the sales deals and then interpreting the underlying forecasting model.
Interpretability was achieved using two novel model-agnostic interpretability techniques, Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). The resulting explanations were assessed through human evaluation studies. The results of these studies clearly indicate that the explanations produced by LIME and SHAP greatly helped lay people understand the predictions made by the machine learning model. The results also indicate that LIME and SHAP explanations were almost equally understandable, with LIME performing better by a very small margin. The work done during this project can easily be extended to any time series forecasting or classification scenario for achieving and evaluating interpretability. Furthermore, this work can offer a very good framework for achieving and evaluating interpretability in any machine learning-based regression or classification problem. [en]
dc.format.extent: 70 + 1
dc.programme: Master's Programme in ICT Innovation [fi]
dc.programme.major: Autonomous Systems [fi]
dc.subject.keyword: time series [en]
dc.subject.keyword: explainable artificial intelligence [en]
dc.title: Interpreting Multivariate Time Series for an Organization Health Platform [en]
dc.type: G2 Pro gradu, diplomityö [fi]
dc.type.ontasot: Master's thesis [en]