AutoML: Comparing performance with human-designed solutions in Kaggle competitions

dc.contributor: Aalto University (en)
dc.contributor: Aalto-yliopisto (fi)
dc.contributor.advisor: Malo, Pekka
dc.contributor.author: Holopainen, Aleksi
dc.contributor.department: Tieto- ja palvelujohtamisen laitos (fi)
dc.contributor.school: Kauppakorkeakoulu (fi)
dc.contributor.school: School of Business (en)
dc.date.accessioned: 2024-08-18T16:02:10Z
dc.date.available: 2024-08-18T16:02:10Z
dc.date.issued: 2024
dc.description.abstract: The adoption of Machine Learning (ML) has been a vital point of interest for organizations globally, but it has been slowed by high costs related to expert personnel and computational power. However, as computational power has become cheaper and more widely available, a solution is emerging that addresses the need for the technical skills of ML experts: AutoML. AutoML tools aim to automate the ML pipeline so that domain experts can also develop their own predictive models, thus further democratizing ML. This paper surveys different techniques used to automate the pipeline and compares the results of a newly released AutoML tool against human-designed solutions in Kaggle competitions. The results are also benchmarked against other frameworks based on the study by Erickson et al. (2020). Furthermore, the paper proposes a theoretical framework that can be used to assess an ML task's difficulty when testing AutoML tools. The research consisted of taking part in 10 relatively recent competitions that had a large number of submissions and covered binary classification, regression, and multiclass classification tasks. Based on the results, the AutoML tool used was on average better than a third of the human competitors. The results indicated that a larger dataset, a relatively higher share of numerical features, and the task being binary classification each had a negative impact on the framework's performance. Compared to the other 6 frameworks, it had below-average results. To summarise, using only AutoML tools to create a model is fast, but it comes at a notable cost to performance. (en)
dc.format.extent: 45+5
dc.format.mimetype: application/pdf (en)
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/129866
dc.identifier.urn: URN:NBN:fi:aalto-202408185430
dc.language.iso: en (en)
dc.location: P1 I (fi)
dc.programme: Information and Service Management (ISM) (en)
dc.subject.keyword: machine learning (en)
dc.subject.keyword: automl (en)
dc.subject.keyword: benchmark (en)
dc.subject.keyword: kaggle (en)
dc.subject.keyword: qlik automl (en)
dc.title: AutoML: Comparing performance with human-designed solutions in Kaggle competitions (en)
dc.title: AutoML: Suorituskyvyn vertaaminen ihmisen suunnittelemiin ratkaisuihin Kaggle kilpailuissa (fi)
dc.type: G2 Pro gradu, diplomityö (fi)
dc.type.ontasot: Master's thesis (en)
dc.type.ontasot: Maisterin opinnäyte (fi)
local.aalto.electroniconly: yes
local.aalto.openaccess: yes
Files
Name: master_Holopainen_Aleksi_2024.pdf
Size: 627.21 KB
Format: Adobe Portable Document Format