Real-Time Match Outcome Prediction in Elite Soccer
Perustieteiden korkeakoulu (School of Science)
Master's thesis
Authors
Date
2021-12-13
Department
Major/Subject
Data Science
Mcode
SCI3095
Degree programme
Master's Programme in ICT Innovation
Language
en
Pages
62 + 15
Series
Abstract
Over the years, match outcome prediction in soccer has become a prominent subject in both academic and industrial research. In a field where data is becoming increasingly accessible, making accurate forecasts about an infamously unpredictable game constitutes a challenging task for state-of-the-art machine learning models. The applications are found not only in the sports betting industry but also within the clubs themselves, as a team's results play a predominant role in its ecosystem. To date, most studies focus on pre-match predictions. However, the rapid development of companies dedicated to collecting fine-grained soccer data is opening up new opportunities in real-time analysis, in which predicted outcome probabilities can be continuously updated based on the match context or a team's tactical changes. This study proposes two novel approaches for the real-time match outcome prediction task in soccer. The first approach (MC model) uses Monte Carlo methods on the shot generation and conversion processes to simulate the match from any given time. The inter-arrival times of shots and their conversion probabilities are modelled using Weibull and logistic regression models respectively, whose parameters are learned via Markov Chain Monte Carlo algorithms. The second approach (MLP model) is an extension that uses a multilayer perceptron to build upon the learnt generation and conversion parameters and various other soccer KPIs. The models' predictive power is measured against Dixon and Coles' approach on the 2018-19 and 2019-20 English Premier League seasons. The study demonstrates that the MLP model clearly outperforms both this baseline and the MC model in real-time settings, even when the number of training matches is highly restricted.
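As a rough illustration of the MC model idea described above, the following Python sketch simulates the remainder of a match from a given minute. It is not the thesis implementation: the function names, the parameter dictionaries, and the single uniformly drawn shot feature are hypothetical. The parameters are fixed point values rather than MCMC posterior samples, and the Weibull clock is simply restarted at the current minute, ignoring time elapsed since the last shot.

```python
import math
import random


def conversion_probability(intercept, coef, feature):
    """Logistic model for the chance that a shot becomes a goal."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * feature)))


def simulate_remaining(minute, score, teams, rng, match_length=90.0):
    """Simulate one possible continuation; return the final (home, away) score."""
    goals = list(score)
    for i, team in enumerate(teams):
        t = minute
        while True:
            # Weibull-distributed waiting time until this team's next shot.
            t += rng.weibullvariate(team["scale"], team["shape"])
            if t >= match_length:
                break
            # Hypothetical shot feature, drawn uniformly for illustration only.
            feature = rng.random()
            p_goal = conversion_probability(team["intercept"], team["coef"], feature)
            if rng.random() < p_goal:
                goals[i] += 1
    return tuple(goals)


def outcome_probabilities(minute, score, teams, n_sims=5000, seed=0):
    """Estimate home win / draw / away win probabilities by Monte Carlo."""
    rng = random.Random(seed)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        home, away = simulate_remaining(minute, score, teams, rng)
        if home > away:
            counts["home"] += 1
        elif home < away:
            counts["away"] += 1
        else:
            counts["draw"] += 1
    return {k: v / n_sims for k, v in counts.items()}


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the thesis.
    team = {"scale": 8.0, "shape": 1.1, "intercept": -2.5, "coef": 1.0}
    print(outcome_probabilities(60.0, (1, 0), [team, dict(team)]))
```

Calling `outcome_probabilities` repeatedly as the minute and score change gives the kind of continuously updated probabilities the abstract refers to, under the simplifying assumptions listed above.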
In addition, it is shown that goal difference and time are the two KPIs standing out in terms of permutation importance score, which would tend to rule out models that are not able to account for similar match progression features for the match prediction problem. Finally, the study illustrates a use case of real-time predictions for post-match analysis and discusses several possible improvements.
Description
Supervisor
Babbar, Rohit
Thesis advisor
Karavolos, Daniel
Keywords
sports, soccer, outcome prediction, deep learning, Bayesian inference, Weibull