Machine learning-based predictive modelling of house prices
School of Business
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2025-02-11
Department
Major/Subject
Mcode
Degree programme
Master's programme in Information and Service Management
Language
en
Pages
67
Series
Abstract
Traditionally, hedonic price models have been employed to estimate residential property prices. In the era of big data, however, machine learning methods have gained a reputation as a more sophisticated alternative for predicting property prices. This thesis presents a machine learning framework for predicting residential property prices. The algorithms under consideration are linear regression, lasso regression, random forests, and XGBoost (extreme gradient boosting). The analysis uses real-world data from property transactions, including a range of location-specific and socioeconomic attributes in addition to structural features of the property. The study focuses on properties situated in three cities: Helsinki, Espoo, and Vantaa. The types of properties examined are apartments (kerrostalo), row houses (rivitalo), and detached houses (omakotitalo). The results reveal that both random forests and XGBoost significantly outperform classical methods such as multiple linear regression and lasso regression, as evidenced by their lower mean squared errors across nearly all housing categories. This outcome aligns with previous research, suggesting that applying machine learning techniques to housing price prediction is not only feasible but also more effective than conventional hedonic price models. One of the key findings of this study is that location plays an important role in determining property prices, with the independent variables showing varying levels of impact across the cities. In addition, floor area and building condition emerge as crucial factors affecting prices. Interestingly, although most of Helsinki's buildings are older, the city maintains a higher average price level than Espoo and Vantaa.
The framework developed in this thesis can serve as a pricing tool to help stakeholders in the housing market make better decisions and develop investment strategies. This development highlights the growing importance of machine learning in real estate economics and reflects how the availability of modern data is transforming property valuation.
Description
Supervisor
Viitasaari, Lauri
Keywords
house price prediction, machine learning, multiple linear regression, lasso regression, random forest, xgboost
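
The model comparison described in the abstract (fit several regressors, then rank them by mean squared error on held-out data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the data are synthetic, the feature names (`area`, `age`, `location`) and the price formula are hypothetical, the hyperparameters are arbitrary, and scikit-learn's `GradientBoostingRegressor` is used as a stand-in for XGBoost so that the example needs only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the thesis data). The price formula is a
# hypothetical one with an area-by-location interaction and a non-linear
# age effect, so that linear and tree-based models can behave differently.
rng = np.random.default_rng(0)
n = 2000
area = rng.uniform(20, 200, n)      # floor area in square metres (hypothetical)
age = rng.uniform(0, 100, n)        # building age in years (hypothetical)
location = rng.uniform(0, 1, n)     # location score in [0, 1] (hypothetical)
price = (
    3000 * area * (1 + location)    # area effect scaled by location
    - 50000 * np.log1p(age)         # diminishing penalty for older buildings
    + rng.normal(0, 20000, n)       # unexplained noise
)

X = np.column_stack([area, age, location])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=0
)

# The four model families named in the abstract; gradient boosting here is
# scikit-learn's implementation, swapped in for XGBoost.
models = {
    "linear regression": LinearRegression(),
    "lasso regression": Lasso(alpha=1.0, max_iter=10000),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
test_mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_mse[name] = mean_squared_error(y_test, model.predict(X_test))

# Print the models from lowest to highest test MSE.
for name, mse in sorted(test_mse.items(), key=lambda kv: kv[1]):
    print(f"{name:>18}: test MSE = {mse:,.0f}")
```

The ranking this prints reflects only the synthetic price formula above and says nothing about the thesis's actual results; with real transaction data, the same loop would typically be run separately per city and property type.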