Benchmark of Self-supervised Graph Neural Networks
School of Science |
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2022-07-29
Department
Major/Subject
Machine Learning, Data Science and Artificial Intelligence (Macadamia)
Mcode
SCI3044
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
52+2
Series
Abstract
A graph is an abstract data structure with abundant applications, such as social networks, biochemical molecules, and traffic maps. Graph neural networks (GNNs), deep learning tools adapted to irregular non-Euclidean data, are designed for such graph data but rely heavily on manual labels. Learning generalizable and reliable representations for unlabeled graph-structured data has therefore become an attractive and trending task in academia because of its promising application scenarios. Recently, numerous self-supervised learning algorithms for GNNs (SSL-GNNs) have been proposed with success on this task. However, the proposed methods are often evaluated with different architectures and evaluation processes on different small-scale datasets, resulting in unreliable model comparisons. To address this problem, this thesis builds a benchmark with a unified framework, a standard evaluation process, and replaceable blocks. Nine state-of-the-art SSL-GNN algorithms are implemented and compared on this benchmark under consistent settings: a shared GNN encoder architecture, a common pre-training and fine-tuning scheme, and a unified evaluation protocol. Each model is pre-trained on a large-scale dataset, ZINC-15, containing two million molecules, and then fine-tuned on eight biophysical downstream datasets for the graph classification task. The experimental results show that two of the nine algorithms outperform the others under this benchmark. Furthermore, the comparison between algorithms also reveals correlations between the pre-training dataset and certain fine-tuning datasets, and these correlations are analyzed in terms of the model mechanisms. The implemented benchmark and the findings of this thesis are expected to promote transfer learning in graph representation learning.
Description
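The unified evaluation protocol summarized in the abstract (one shared encoder architecture, pre-training on a single unlabeled corpus, then separate fine-tuning on each downstream dataset) can be sketched in outline. This is a minimal, hypothetical illustration of the pipeline shape only; the function names, the dictionary-based stand-in for an encoder, and the placeholder update steps are assumptions for the example, not the thesis's actual implementation.

```python
# Hypothetical sketch of a unified SSL-GNN benchmark protocol:
# every algorithm gets the same encoder architecture, is pre-trained
# once on the unlabeled corpus, and is then fine-tuned and evaluated
# independently on each downstream dataset.
from copy import deepcopy

def pretrain(encoder, unlabeled_graphs, ssl_objective):
    """Stand-in for self-supervised pre-training on unlabeled graphs."""
    encoder["pretrain_steps"] = len(unlabeled_graphs)  # placeholder update
    encoder["objective"] = ssl_objective
    return encoder

def finetune_and_score(encoder, dataset):
    """Stand-in for supervised fine-tuning and graph-classification scoring."""
    tuned = deepcopy(encoder)          # each dataset fine-tunes its own copy
    tuned["finetuned_on"] = dataset["name"]
    return {"dataset": dataset["name"], "encoder": tuned}

def run_benchmark(algorithms, unlabeled_graphs, downstream_datasets):
    """Run every algorithm through the same pre-train/fine-tune pipeline."""
    results = {}
    for algo_name, objective in algorithms.items():
        encoder = {"arch": "shared-GNN"}  # identical architecture for all
        encoder = pretrain(encoder, unlabeled_graphs, objective)
        results[algo_name] = [finetune_and_score(encoder, d)
                              for d in downstream_datasets]
    return results
```

The key design point mirrored here is that only the self-supervised objective varies between algorithms; the encoder architecture, pre-training data, and fine-tuning procedure are held fixed so that downstream scores are directly comparable.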
Supervisor
Solin, Arno
Thesis advisor
Verma, Vikas
Keywords
machine learning, benchmark, graph neural networks, self-supervised learning, pre-training and fine-tuning