Supervised distance preserving projections for dimensionality reduction

School of Science | Master's thesis

Date

2011

Major/Subject

Information Technology (Informaatiotekniikka)

Mcode

T-61

Language

en

Pages

ix + 53

Abstract

When working with high-dimensional data, dimensionality reduction is an essential technique for overcoming the "curse of dimensionality". This work focuses on supervised dimensionality reduction, especially for regression tasks. The goal of dimensionality reduction for regression is to learn a low-dimensional representation of the original high-dimensional data such that the new representation leads to accurate regression predictions. Motivated by continuity preservation, we propose a novel algorithm for supervised dimensionality reduction named Supervised Distance Preserving Projection (SDPP). To preserve continuity in the low-dimensional subspace, we consider the local geometrical structure of the original input space and of the response space. Within a neighborhood of each point in the input space, the optimization criterion of SDPP minimizes the difference between the pairwise distances of the projected covariates and the pairwise distances of the corresponding responses. As a result, the local geometrical structure of the low-dimensional subspace matches the geometrical characteristics of the response space as closely as possible. This local match not only facilitates the design of an accurate regressor but also uncovers the information needed for visualization. Different optimization schemes are proposed for solving SDPP efficiently. Moreover, because the learned mapping is parametric, it readily handles out-of-sample data points. A kernelized version of SDPP is derived for nonlinear data, and an intuitive extension of SDPP is presented for classification tasks. We compare the performance of our method with state-of-the-art algorithms on both synthetic and real-world data. These comparisons show the superiority of our approach on the task of dimensionality reduction for regression and classification.
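
To make the criterion described in the abstract concrete, the following is a minimal sketch, not the thesis implementation. It assumes a linear projection matrix W, squared Euclidean distances, and k-nearest-neighbour neighborhoods in the input space; the function names, the choice of k, and the 1/n normalization are all illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row of X (excluding itself)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq, np.inf)  # a point is not its own neighbour
    return np.argsort(sq, axis=1)[:, :k]


def sdpp_objective(W, X, Y, k=5):
    """Mean squared mismatch between local distances in the projected
    space (X @ W) and in the response space (Y).

    Assumed form: for each point i and each neighbour j of i,
    (||z_i - z_j||^2 - ||y_i - y_j||^2)^2, averaged over points.
    """
    n = X.shape[0]
    nbrs = knn_indices(X, k)
    Z = X @ W  # projected covariates; also how new points would be mapped
    total = 0.0
    for i in range(n):
        j = nbrs[i]
        d_proj = np.sum((Z[i] - Z[j]) ** 2, axis=1)
        d_resp = np.sum((Y[i] - Y[j]) ** 2, axis=1)
        total += np.sum((d_proj - d_resp) ** 2)
    return total / n


# Toy check: when the response is itself a linear projection of X,
# that projection reproduces the response distances exactly.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
w_true = rng.normal(size=(4, 1))
Y = X @ w_true
loss_true = sdpp_objective(w_true, X, Y)
loss_zero = sdpp_objective(np.zeros((4, 1)), X, Y)
```

In this toy setting `loss_true` is zero while the all-zero projection gives a strictly positive value. Because the mapping is just `X @ W`, an unseen point is projected the same way, which is the out-of-sample property the abstract mentions.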

Supervisor

Simula, Olli

Thesis advisor

Corona, Francesco

Keywords

supervised dimensionality reduction, regression, classification, optimization, kernel
