Predicting work disability period lengths utilizing machine learning methods

School of Business | Master's thesis

Date

2024

Degree programme

Information and Service Management (ISM)

Language

en

Pages

50 + 7

Abstract

There were 114,500 occupational accidents in Finland in 2023, resulting in over half a billion euros of work disability compensation paid by Finnish insurance companies. By predicting the lengths of work disability periods early in the claim process, the companies could prioritize the more severe cases and communicate about them to their corporate customers. The European Statistics on Accidents at Work (ESAW) coding practices provide a useful, structured data framework for predictive analyses of work disability period lengths. However, the ESAW variables have rarely been used in such research. We classified work disability period lengths into three classes (0-2 days, 3-30 days, and 31 or more days) using machine learning methods. The three final models were XGBoost with hyperparameter optimization using the Tree-structured Parzen Estimator (TPE), Multinomial Logistic Regression, and a Random Forest Classifier. The best performing model was XGBoost with TPE hyperparameter optimization, which achieved an accuracy of 57% and a macro F1-score of 0.50. By analyzing feature importances, we also found that the ESAW variables type of injury and injured body part offered the most explanatory value of all the included variables. This research provides a multiclass classification benchmark for work disability length prediction using ESAW variables. We conclude that more thorough data on occupational accidents, in the form of accident descriptions, would be needed to achieve more accurate predictions.
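
The class boundaries and the two reported metrics can be illustrated with a minimal sketch. It is not the thesis code: the function names (`disability_class`, `accuracy`, `macro_f1`) and the toy day counts are assumptions made for illustration only, and the metrics are written out in plain Python rather than taken from any particular library.

```python
# Illustrative sketch only: names and toy data are hypothetical, not from the thesis.

CLASS_LABELS = ("0-2 days", "3-30 days", "31 or more days")


def disability_class(days: int) -> int:
    """Map a work disability length in days to one of the three classes."""
    if days < 0:
        raise ValueError("days must be non-negative")
    if days <= 2:
        return 0
    if days <= 30:
        return 1
    return 2


def accuracy(y_true, y_pred):
    """Share of cases whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, n_classes=3):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted cases contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n_classes


# Made-up day counts, used only to exercise the functions.
true_days = [0, 1, 5, 12, 45, 90]
predicted_days = [2, 4, 7, 40, 60, 20]

y_true = [disability_class(d) for d in true_days]
y_pred = [disability_class(d) for d in predicted_days]

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred)
```

Because macro F1 weights each class equally, it can sit below accuracy when a model performs worse on the less frequent classes, which is one reason to report both figures for a three-class problem like this one.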

Thesis advisor

Vilkkumaa, Eeva

Keywords

occupational accidents, work disability period length prediction, ESAW, machine learning
