Empirical analysis of prediction models for churn and customer lifetime value in freemium mobile games

School of Business | Master's thesis
Degree programme
Information and Service Management (ISM)
61 + 3
In mobile gaming, the freemium business model, in which the game is downloaded for free and players pay for voluntary in-app purchases, has become popular. However, this has made it hard to predict when players will stop playing the game and how much revenue they will generate during their lifetime. These prediction tasks are called churn prediction and customer lifetime value (CLV) prediction, respectively. The thesis aims to answer two questions: first, what the current academic state-of-the-art solutions are for predicting churn and CLV in a noncontractual and continuous setting, and second, whether the suggested models work in a real business setting for freemium mobile games. The empirical analysis is performed with data from a Finnish mobile gaming company, and the target variables are measured 30 days after a new player’s registration. Both probabilistic models and machine learning models are in scope. A family of probabilistic models called ‘Buy-till-you-die’ models is popular for noncontractual settings, and two of them are included in both prediction tasks of the empirical analysis. Within machine learning, the ensemble methods of gradient boosting and random forests have been effective in both classification and regression tasks, and deep neural networks have been found suitable in some of the latest studies. The empirical analysis is done in two parts, for churn and CLV separately, as they represent different prediction problems: classification and regression, respectively. The achieved predictive performance is on par with earlier literature. For churn, no single model outperforms the rest, but logistic regression and the random forest classifier are recommended due to their easy and transparent implementation. Linear regression and the random forest regressor perform best in CLV prediction and are the recommended choice. An observation period of one week is suitable for predictor variables, and a longer period does not improve model performance.
Hence, it is recommended that practitioners use the aforementioned models with an observation period of one week. In addition, the feature importances of the models indicate which features of the game affect churn and CLV, and companies can focus on improving these factors. In the analyzed game, sessions and the sum of transactions from the observation period have by far the most predictive power, which suggests that in-game behavior does not affect the outcomes. The CLV prediction results were accurate enough to enable customer segmentation, but not accurate enough to predict future cash flows.
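As an illustration of the setup described in the abstract, the following minimal sketch derives predictor variables from a one-week observation period and target variables at 30 days after registration for a single player. The function name `build_example`, the input layout, the sample values, and in particular the churn definition (no session between the end of the observation period and day 30) are assumptions made for this example; the abstract does not specify how the thesis defines them.

```python
from datetime import datetime, timedelta

OBSERVATION_DAYS = 7   # predictor window (one week, as in the abstract)
TARGET_DAYS = 30       # targets measured 30 days after registration


def build_example(registered_at, sessions, transactions):
    """Return (features, targets) for one player.

    sessions:     list of datetime objects (session start times)
    transactions: list of (datetime, amount) tuples
    This input layout is hypothetical, chosen only for the sketch.
    """
    obs_end = registered_at + timedelta(days=OBSERVATION_DAYS)
    target_end = registered_at + timedelta(days=TARGET_DAYS)

    # Predictors use only data from the observation period.
    features = {
        "sessions": sum(registered_at <= t < obs_end for t in sessions),
        "transaction_sum": sum(
            amount for t, amount in transactions if registered_at <= t < obs_end
        ),
    }

    # ASSUMPTION: a player counts as churned if there is no session
    # between the end of the observation period and day 30.
    active_later = any(obs_end <= t < target_end for t in sessions)
    targets = {
        "churned": not active_later,
        # ASSUMPTION: CLV is the revenue accumulated within the first 30 days.
        "clv": sum(
            amount for t, amount in transactions if registered_at <= t < target_end
        ),
    }
    return features, targets


# Made-up player: sessions on days 0, 1, 3 and 12; purchases on days 2 and 20.
reg = datetime(2020, 1, 1)
features, targets = build_example(
    reg,
    sessions=[reg + timedelta(days=d) for d in (0, 1, 3, 12)],
    transactions=[(reg + timedelta(days=2), 4.99), (reg + timedelta(days=20), 9.99)],
)
print(features, targets)
```

Feature rows built this way could then be passed to a classifier for the churn target and to a regressor for the CLV target, matching the two-part analysis described above.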
Thesis advisor
Malo, Pekka
Luoma, Jukka
churn, customer lifetime value, CLV, LTV, freemium, mobile games, prediction, machine learning