Supervised distance preserving projections for dimensionality reduction
School of Science
Master's thesis
Authors
Date
2011
Department
Major/Subject
Informaatiotekniikka (Information Technology)
Mcode
T-61
Degree programme
Language
en
Pages
ix + 53
Series
Abstract
When facing high-dimensional data, dimensionality reduction is an essential technique for overcoming the "curse of dimensionality". This work focuses on supervised dimensionality reduction, especially for regression tasks. The goal of dimensionality reduction for regression is to learn a low-dimensional representation of the original high-dimensional data such that this new representation leads to accurate regression predictions. Motivated by continuity preservation, we propose a novel algorithm for supervised dimensionality reduction named Supervised Distance Preserving Projection (SDPP). In order to preserve continuity in the low-dimensional subspace, we consider the local geometrical structure of the original input space and the response space. Within a neighborhood of each point in the input space, the optimization criterion of SDPP minimizes the difference between the distances of the projected covariates and the distances of the responses. Consequently, the local geometrical structure of the low-dimensional subspace optimally matches the geometrical characteristics of the response space. This local match not only facilitates the design of an accurate regressor but also uncovers the information needed for visualization. Different optimization schemes are proposed for solving SDPP efficiently. Moreover, the learned parametric mapping easily handles out-of-sample data points. A kernelized version of SDPP is derived for nonlinear data, and an intuitive extension of SDPP is also presented for classification tasks. We compare the performance of our method with state-of-the-art algorithms on both synthetic and real-world data. These comparisons show the superiority of our approach on the task of dimensionality reduction for regression and classification.
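The criterion described in the abstract can be illustrated with a small numerical sketch. The Python code below is a minimal, assumption-laden illustration of an SDPP-like objective: within a k-nearest-neighbour neighbourhood of each input point, it penalizes the mismatch between the (squared) distances of the projected covariates and the (squared) distances of the responses, and fits a linear projection with a generic off-the-shelf optimizer. The helper names (sdpp_objective, k, d), the use of squared distances, and the L-BFGS solver are illustrative assumptions, not the thesis's actual formulation or optimization scheme.

```python
# Minimal sketch of an SDPP-style objective as described in the abstract.
# All names and the generic optimizer are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize
from sklearn.neighbors import NearestNeighbors

def sdpp_objective(W_flat, X, Y, neighbors, d):
    """Sum of squared differences between projected-input distances and
    response distances, taken over each point's input-space neighborhood."""
    n, p = X.shape
    W = W_flat.reshape(p, d)          # linear projection: Z = X @ W
    Z = X @ W
    total = 0.0
    for i in range(n):
        for j in neighbors[i]:
            dz = np.sum((Z[i] - Z[j]) ** 2)   # squared distance of projected covariates
            dy = np.sum((Y[i] - Y[j]) ** 2)   # squared distance of responses
            total += (dz - dy) ** 2
    return total

# Toy usage: project 10-D inputs with a 1-D response onto a 2-D subspace.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Y = X[:, :1] ** 2 + 0.1 * rng.normal(size=(100, 1))
k, d = 5, 2

# k-nearest neighbours in the input space (drop each point's self-neighbour).
nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
neighbors = nbrs.kneighbors(X, return_distance=False)[:, 1:]

W0 = rng.normal(size=(X.shape[1], d)).ravel()
res = minimize(sdpp_objective, W0, args=(X, Y, neighbors, d), method="L-BFGS-B")
W = res.x.reshape(X.shape[1], d)     # learned projection; out-of-sample: X_new @ W
```

Because the mapping is an explicit linear projection, out-of-sample points are handled simply by multiplying them with the learned projection matrix, in line with the parametric mapping mentioned in the abstract.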
Supervisor
Simula, Olli
Thesis advisor
Corona, Francesco
Keywords
supervised dimensionality reduction, regression, classification, optimization, kernel