Sparsity Driven Statistical Learning for High-Dimensional Regression and Classification

School of Electrical Engineering | Doctoral thesis (article-based) | Defence date: 2020-05-14
89 + app. 58
Aalto University publication series DOCTORAL DISSERTATIONS, 67/2020
Statistical analysis and modeling techniques are needed to extract information from the plethora of high-dimensional (HD) data generated by digitalization, increasing computing power, and the sensors and smart devices in our everyday life. Statistical learning from HD data remains a challenging problem despite continuous improvements in computational resources and learning techniques. The performance of supervised learning approaches, such as regression or classification, often degrades when the number of observations (samples) is insufficient compared to the data dimensionality (variables). Developing supervised learning methods for explanatory and predictive modeling of such HD data sets is crucial. Therefore, in this thesis, we propose new sparsity-driven methods for regression and classification which offer improved explanatory and predictive power.
In this thesis, new solvers are proposed for sparsity-driven linear regression problems, namely the Lasso and the elastic net (EN), which are specially designed to handle complex-valued data. These solvers are applied for explanatory modeling to estimate the directions of arrival (DoAs) of sources impinging on a sensor array using the compressed beamforming (CBF) technique. The developed methods are, however, completely general and can be applied in various HD linear regression problems dealing with complex- or real-valued data. Moreover, an approach called the sequential adaptive EN is developed to enhance the recovery of the exact support of the sparse signal vector. This approach is then used to find the DoAs of sources within the CBF framework. Furthermore, the regularization paths of the Lasso and EN computed by the developed algorithm, together with a generalized information criterion, are used to propose a novel method for detecting the sparsity level of the signal, which corresponds to the number of sources in the DoA estimation problem.
This thesis also proposes a compressive classification framework for predicting the class of a high-dimensional observation. The proposed set of classifiers, based on compressive regularized discriminant analysis (CRDA), is applied to feature selection and classification of HD data, particularly gene expression data. The CRDA-based approach outperforms current state-of-the-art methods, each of which fails in at least one of three facets: accuracy, learning speed, and interpretability.
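As a rough illustration of the kind of sparsity-driven regression problem the abstract refers to, the following is a minimal sketch of an elastic net fit for complex-valued data. It is not the solver developed in the thesis: it uses plain cyclic coordinate descent as a stand-in, and the objective parametrization, the function names `soft_threshold` and `complex_elastic_net`, and the parameters `lam`, `alpha` and `n_iter` are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink the modulus of z by t, keep its phase."""
    mag = abs(z)
    return 0.0 if mag <= t else z * (1.0 - t / mag)

def complex_elastic_net(X, y, lam, alpha=1.0, n_iter=200):
    """Cyclic coordinate descent for the (assumed) objective
        0.5 * ||y - X b||^2 + lam * (alpha * ||b||_1 + 0.5 * (1 - alpha) * ||b||^2)
    with complex-valued X, y and b. alpha = 1 gives the Lasso.
    Illustrative only; not the pathwise algorithm of the thesis."""
    n, p = X.shape
    b = np.zeros(p, dtype=complex)
    r = y.astype(complex)                    # residual y - X b (b starts at zero)
    col_sq = np.sum(np.abs(X) ** 2, axis=0)  # squared column norms
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]              # remove coordinate j from the fit
            z = np.vdot(X[:, j], r)          # x_j^H r (vdot conjugates x_j)
            b[j] = soft_threshold(z, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            r -= X[:, j] * b[j]              # put the updated coordinate back
    return b
```

Under these assumptions, calling `complex_elastic_net(X, y, lam=0.5)` on a design matrix with orthonormal columns reduces to soft-thresholding the entries of `X^H y`, which is a convenient sanity check for the update rule.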
The public defense on 14th May 2020 at 16:00 (4 p.m.) will be organized via remote technology.
Supervising professor
Ollila, Esa, Assoc. Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
Thesis advisor
Ollila, Esa, Assoc. Prof., Aalto University, Department of Signal Processing and Acoustics, Finland
classification, compressibility, feature selection, high-dimensional statistics, joint-sparse recovery, regression, sparsity, statistical learning
Other note
  • [Publication 1]: M. N. Tabassum and E. Ollila. Single-snapshot DoA estimation using adaptive elastic net in the complex domain. In Proc. of 4th IEEE International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa), Aachen, Germany, pp. 197-201, Sept. 2016.
    DOI: 10.1109/CoSeRa.2016.7745728
  • [Publication 2]: M. N. Tabassum and E. Ollila. Pathwise least angle regression and a significance test for the elastic net. In Proc. of 25th European Signal Processing Conference (EUSIPCO), Kos, Greece, pp. 1309-1313, 28 Aug.–2 Sept. 2017.
    DOI: 10.23919/EUSIPCO.2017.8081420
  • [Publication 3]: M. N. Tabassum and E. Ollila. Compressive Regularized Discriminant Analysis of High-Dimensional Data with Applications to Microarray Studies. In Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, pp. 4204-4208, April 2018.
    DOI: 10.1109/ICASSP.2018.8462328
  • [Publication 4]: M. N. Tabassum and E. Ollila. Sequential adaptive elastic net approach for single-snapshot source localization. The Journal of the Acoustical Society of America, vol. 143, no. 6, pp. 3873–3882, Jun. 2018.
    DOI: 10.1121/1.5042363
  • [Publication 5]: M. N. Tabassum and E. Ollila. Simultaneous Signal Subspace Rank and Model Selection with an Application to Single-snapshot Source Localization. In Proc. of 26th European Signal Processing Conference (EUSIPCO), Rome, Italy, pp. 1592-1596, Sept. 2018.
    DOI: 10.23919/EUSIPCO.2018.8553171
  • [Publication 6]: M. N. Tabassum and E. Ollila. A Compressive Classification Framework for High-Dimensional Data. Submitted, Jan. 2020.