Learning Embeddings for Graphs and Other High Dimensional Data
School of Electrical Engineering | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Machine Learning, Data Science and Artificial Intelligence
CCIS - Master’s Programme in Computer, Communication and Information Sciences (TS2013)
Abstract: An immense amount of data is now produced daily, and extracting knowledge from it proves fruitful for many scientific purposes. Machine learning algorithms are a means to that end and have grown from a nascent research field into ubiquitous algorithms running in the background of many applications we use every day. Efficient machine learning methods benefit greatly from low-dimensional data. Real-world data, however, is seldom low-dimensional; on the contrary, it can be starkly high-dimensional. Graph-structured data, such as protein-protein interaction networks and social networks, exemplifies such high-dimensional data, and machine learning techniques in their traditional form cannot easily be applied to it. The focus of this report is thus to explore algorithms that generate representation vectors that best encode the structural information of the vertices of a graph. These vectors can in turn be passed on to downstream machine learning algorithms to classify nodes or predict links among them. The study begins by introducing dimensionality reduction techniques for data residing in geometric spaces, followed by two techniques for embedding the vertices of graphs into low-dimensional spaces.
Thesis advisor: Chalermsook, Parinya
machine learning, dimensionality reduction, graph embeddings, random walks