Real-Time Match Outcome Prediction in Elite Soccer

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.advisor: Karavolos, Daniel
dc.contributor.author: Vuylsteker, Léo
dc.contributor.school: Perustieteiden korkeakoulu [fi]
dc.contributor.supervisor: Babbar, Rohit
dc.date.accessioned: 2021-12-19T18:03:00Z
dc.date.available: 2021-12-19T18:03:00Z
dc.date.issued: 2021-12-13
dc.description.abstract: Over the years, match outcome prediction in soccer has become a prominent subject in both academic and industrial research. In a field where data is becoming increasingly accessible, making accurate forecasts about an infamously unpredictable game constitutes a challenging task for state-of-the-art machine learning models. The applications are found not only in the sports betting industry but also within the clubs themselves, as a team's results play a predominant role in its ecosystem. To date, most studies have focused on pre-match predictions, but the rapid development of companies dedicated to collecting fine-grained soccer data is opening up new opportunities in real-time analysis, in which predicted outcome probabilities could be continuously updated based on the match context or the teams' tactical changes. This study proposes two novel approaches for the real-time match outcome prediction task in soccer. The first approach (MC model) uses Monte Carlo methods on the shot generation and conversion processes to simulate the match from any given time. The inter-arrival times of shots and their conversion probabilities are modelled using Weibull and logistic regression models respectively, whose parameters are learned via Markov chain Monte Carlo algorithms. The second approach (MLP model) is an extension that uses a multilayer perceptron to build upon the learnt generation and conversion parameters and various other soccer KPIs. The models' predictive power is measured against Dixon and Coles' approach on the 2018-19 and 2019-20 English Premier League seasons. The study demonstrates that the MLP model clearly outperforms this baseline and the MC model in real-time settings, even when the number of training matches is highly restricted. In addition, it is shown that goal difference and time are the two KPIs that stand out in terms of permutation importance score, which would tend to rule out, for the match prediction problem, models that cannot account for similar match progression features. Finally, the study illustrates a use case of real-time predictions for post-match analysis and discusses several possible improvements. [en]
dc.format.extent: 62 + 15
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/111722
dc.identifier.urn: URN:NBN:fi:aalto-2021121910863
dc.language.iso: en [en]
dc.programme: Master's Programme in ICT Innovation [fi]
dc.programme.major: Data Science [fi]
dc.programme.mcode: SCI3095 [fi]
dc.subject.keyword: sports [en]
dc.subject.keyword: soccer [en]
dc.subject.keyword: outcome prediction [en]
dc.subject.keyword: deep learning [en]
dc.subject.keyword: Bayesian inference [en]
dc.subject.keyword: Weibull [en]
dc.title: Real-Time Match Outcome Prediction in Elite Soccer [en]
dc.type: G2 Pro gradu, diplomityö [fi]
dc.type.ontasot: Master's thesis [en]
dc.type.ontasot: Diplomityö [fi]
local.aalto.electroniconly: yes
local.aalto.openaccess: no
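
Illustrative note on the MC approach named in the abstract. The abstract describes simulating the rest of a match from any given time by drawing shot inter-arrival times from a Weibull distribution and converting shots to goals with a logistic model. The sketch below is only a minimal reading of that description, not the thesis's implementation. All function names, parameter names, and numeric values are hypothetical. It also simplifies in three ways: the parameters are fixed by hand rather than learned with MCMC, the conversion probability is a single logistic intercept per team rather than a regression on shot features, and the shot process is restarted at the current minute.

```python
import math
import random


def logistic(x):
    """Standard logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_remaining_goals(rng, minutes_left, scale, shape, conversion_logit):
    """Simulate one team's goals in the remaining minutes.

    Shots arrive with Weibull(scale, shape) inter-arrival times (in minutes);
    each shot becomes a goal with probability logistic(conversion_logit).
    """
    p_goal = logistic(conversion_logit)
    goals = 0
    t = rng.weibullvariate(scale, shape)  # time of the first simulated shot
    while t < minutes_left:
        if rng.random() < p_goal:
            goals += 1
        t += rng.weibullvariate(scale, shape)
    return goals


def outcome_probabilities(minute, home_goals, away_goals, home, away,
                          n_sims=10000, seed=0, match_length=90.0):
    """Estimate home-win / draw / away-win probabilities by Monte Carlo.

    `home` and `away` are dicts with the hypothetical keys
    'scale', 'shape' and 'conversion_logit'.
    """
    rng = random.Random(seed)
    minutes_left = max(match_length - minute, 0.0)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        h = home_goals + simulate_remaining_goals(rng, minutes_left, **home)
        a = away_goals + simulate_remaining_goals(rng, minutes_left, **away)
        if h > a:
            counts["home"] += 1
        elif h < a:
            counts["away"] += 1
        else:
            counts["draw"] += 1
    return {outcome: n / n_sims for outcome, n in counts.items()}


if __name__ == "__main__":
    # Made-up parameters: roughly one shot every 7-8 minutes, ~10% conversion.
    home = {"scale": 7.0, "shape": 1.1, "conversion_logit": -2.2}
    away = {"scale": 8.0, "shape": 1.1, "conversion_logit": -2.3}
    print(outcome_probabilities(60, 1, 0, home, away))
```

Calling outcome_probabilities again with a later minute or an updated score gives the continuously updated probabilities that the abstract refers to. A version closer to the abstract would draw the Weibull and logistic parameters from their posterior samples instead of fixing them.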
