Machine learning in applied econometrics: Deriving personal income drivers with randomized decision forests

School of Business | Master's thesis
To ask about the availability of the thesis, email the Aalto University Learning Centre at oppimiskeskus@aalto.fi

Date

2016

Major/Subject

Kansantaloustiede
Economics

Language

en

Pages

66

Abstract

In this paper I explore a modern field of research in applied econometrics: machine learning and the estimation of synthetic treatment effects. Data generation is currently on an exponential growth path: smartphones, social media and networks of interconnected devices are generating information at an unprecedented pace, and the size, structure and velocity of these information streams vary greatly. The field of econometrics is evolving as well: classic econometric models can produce biased results with big data and do not scale to modern data sets. I propose the well-performing Random Forests algorithm for use in econometrics. To adapt this method for causal analysis, I explore recent theory on causal decision trees. The proposed framework is then tested by estimating personal income drivers for the top 1% of the U.S. population. The data used is the American Community Survey 5-year sample, consisting of approximately 20 million rows. High income appears to be driven by four core factors: education, experience, working hours and gender. To rank these predictors, I run a synthetic treatment effect simulation. I find that investing in education after a master's degree has a significant positive effect on the likelihood of high income. Additionally, it appears that the negative gender income effect for females can be undone with a combination of work experience and exceptional work ethic.

Keywords

econometrics, machine learning, decision trees, causality, big data, random forests, income, American Community Survey
