AutoML: Comparing performance with human-designed solutions in Kaggle competitions
dc.contributor | Aalto University | en |
dc.contributor | Aalto-yliopisto | fi |
dc.contributor.advisor | Malo, Pekka | |
dc.contributor.author | Holopainen, Aleksi | |
dc.contributor.department | Tieto- ja palvelujohtamisen laitos | fi |
dc.contributor.school | Kauppakorkeakoulu | fi |
dc.contributor.school | School of Business | en |
dc.date.accessioned | 2024-08-18T16:02:10Z | |
dc.date.available | 2024-08-18T16:02:10Z | |
dc.date.issued | 2024 | |
dc.description.abstract | The adoption of Machine Learning (ML) has been a vital point of interest for organizations globally, but it has been slowed by the high costs of expert personnel and computational power. As computational power has become cheaper and more available, a solution is emerging that reduces the need for the technical skills of ML experts: AutoML. AutoML tools aim to automate the ML pipeline so that domain experts can also develop their own predictive models, further democratizing ML. This paper surveys techniques used to automate the pipeline and compares the results of a newly released AutoML tool against human-designed solutions in Kaggle competitions. The results are also benchmarked against other frameworks based on the study by Erickson et al. (2020). Furthermore, it proposes a theoretical framework for assessing the difficulty of an ML task when testing AutoML tools. The research consisted of taking part in 10 relatively recent competitions that had a large number of submissions and covered binary classification, regression, and multiclass classification tasks. Based on the results, the AutoML tool was on average better than a third of the human competitors. The results indicated that a larger dataset, a relatively higher share of numerical features, and a binary classification task each had a negative impact on the framework’s performance. Compared to the other 6 frameworks, it had below-average results. To summarise, using only AutoML tools to create a model is fast, but it comes at a notable cost to performance. | en |
dc.format.extent | 45+5 | |
dc.format.mimetype | application/pdf | en |
dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/129866 | |
dc.identifier.urn | URN:NBN:fi:aalto-202408185430 | |
dc.language.iso | en | en |
dc.location | P1 I | fi |
dc.programme | Information and Service Management (ISM) | en |
dc.subject.keyword | machine learning | en |
dc.subject.keyword | automl | en |
dc.subject.keyword | benchmark | en |
dc.subject.keyword | kaggle | en |
dc.subject.keyword | qlik automl | en |
dc.title | AutoML: Comparing performance with human-designed solutions in Kaggle competitions | en |
dc.title | AutoML: Suorituskyvyn vertaaminen ihmisen suunnittelemiin ratkaisuihin Kaggle kilpailuissa | fi |
dc.type | G2 Pro gradu, diplomityö | fi |
dc.type.ontasot | Master's thesis | en |
dc.type.ontasot | Maisterin opinnäyte | fi |
local.aalto.electroniconly | yes | |
local.aalto.openaccess | yes |
Files
Original bundle
- Name: master_Holopainen_Aleksi_2024.pdf
- Size: 627.21 KB
- Format: Adobe Portable Document Format