Essays on Convex Regression and Frontier Estimation

School of Business | Doctoral thesis (article-based) | Defence date: 2022-10-07
40 + app. 154
Aalto University publication series DOCTORAL THESES, 111/2022
Convex regression is increasingly popular in economics, finance, operations research, machine learning, and statistics. In the field of productivity and efficiency analysis, convex regression and its latest developments have bridged the long-standing gap between conventional deterministic nonparametric methods and stochastic parametric methods. This dissertation presents three essays that contribute to convex regression and frontier estimation from methodological, computational, and empirical perspectives.

Essay I develops a new L0-norm regularization approach to convex quantile and expectile regression for subset selection. In the paper, I show how to solve the proposed L0-norm regularization approach in practice using mixed-integer programming, and I build a link to the commonly used L1-norm regularization approach. A Monte Carlo study compares the finite-sample performance of the proposed L0-penalized convex quantile and expectile regression approaches with that of the L1-norm regularization approaches. The proposed approach is further applied to benchmark the sustainable development performance of the OECD countries and to analyze empirically how accurately it reduces the dimensionality of the variables. The results from the simulation and the application illustrate that the proposed L0-norm regularization approach addresses the curse of dimensionality more effectively than the L1-norm regularization approach in multidimensional spaces.

Essay II is motivated by computational and pedagogical needs. The heavy computational burden and the lack of powerful, reliable, and fully open-access computational packages have slowed the diffusion of these advanced estimation techniques into empirical practice. The Python package pyStoNED aims to address this challenge by providing a freely available and user-friendly tool for multivariate convex regression, convex quantile regression, convex expectile regression, and stochastic nonparametric envelopment of data. This paper presents a tutorial on the pyStoNED package and illustrates its application.

Essay III contributes to the use of convex quantile and expectile regression for estimating shadow prices and marginal abatement costs of bad outputs more accurately. Specifically, using panel data on 30 Chinese provinces for 1997–2015, we first estimate the marginal CO2 abatement costs with a novel data-driven approach, convex quantile regression. Based on the marginal abatement cost estimates and China's plans regarding carbon intensity reduction and economic growth, we then present a forward-looking assessment of the abatement costs for Chinese provinces for 2016–2020. Our main finding is that all the Chinese provinces have a negative abatement cost, which means that these provinces can benefit from an increase in the absolute level of CO2 emissions despite the constraint on carbon intensity.
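As background for the quantile and expectile regressions named in the abstract, the following is a minimal sketch of the two standard loss functions they minimize: the check (pinball) loss for quantiles and the asymmetric squared loss for expectiles. It is not taken from the thesis or from pyStoNED; the function names `check_loss`, `expectile_loss`, and `best_constant` and the demo data are hypothetical. The sketch also omits the shape constraints (convexity or concavity, monotonicity) that distinguish convex regression, and fits only a constant by grid search.

```python
def check_loss(residual: float, tau: float) -> float:
    """Quantile (pinball) loss: tau * r for r >= 0, (tau - 1) * r otherwise."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def expectile_loss(residual: float, tau: float) -> float:
    """Asymmetric squared loss: squared residual weighted by tau or (1 - tau)."""
    weight = tau if residual >= 0 else 1.0 - tau
    return weight * residual ** 2


def best_constant(y, tau, loss):
    """Grid-search the constant c that minimizes sum(loss(y_i - c, tau)).

    Hypothetical helper for illustration only; real estimators solve a
    constrained optimization problem rather than scanning a grid.
    """
    lo, hi = min(y), max(y)
    grid = [lo + (hi - lo) * k / 1000 for k in range(1001)]
    return min(grid, key=lambda c: sum(loss(v - c, tau) for v in y))


# Made-up demo data: sample median is 3, sample mean is 4.
y_demo = [0.0, 2.0, 3.0, 5.0, 10.0]
median_fit = best_constant(y_demo, 0.5, check_loss)    # close to the median
mean_fit = best_constant(y_demo, 0.5, expectile_loss)  # close to the mean
```

At tau = 0.5 the check loss recovers the median and the asymmetric squared loss recovers the mean; other values of tau shift the fit toward the upper or lower part of the distribution.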
Supervising professor
Kuosmanen, Timo, Prof., University of Turku, Turku School of Economics, Finland
Thesis advisor
Kuosmanen, Timo, Prof., University of Turku, Turku School of Economics, Finland
abatement cost, convex regression, convex quantile regression, frontier estimation, regularization, subset selection
Other note
  • [Publication 1]: Sheng Dai. Variable selection in convex quantile regression: L1-norm or L0-norm regularization? European Journal of Operational Research, In press, May 2022.
    DOI: 10.1016/j.ejor.2022.05.041
  • [Publication 2]: Sheng Dai, Yu-Hsueh Fang, Chia-Yen Lee, Timo Kuosmanen. pyStoNED: A Python package for convex regression and frontier estimation. Revised and resubmitted to Journal of Statistical Software, May 2022.
  • [Publication 3]: Sheng Dai, Xun Zhou, Timo Kuosmanen. Forward-looking assessment of the GHG abatement cost: Application to China. Energy Economics, Volume 88, 104758, May 2020.
    DOI: 10.1016/j.eneco.2020.104758