Benchmark of Self-supervised Graph Neural Networks

Perustieteiden korkeakoulu | Master's thesis

Date

2022-07-29

Department

Major/Subject

Machine Learning, Data Science and Artificial Intelligence (Macadamia)

Mcode

SCI3044

Degree programme

Master’s Programme in Computer, Communication and Information Sciences

Language

en

Pages

52+2

Series

Abstract

A graph is an abstract data structure with abundant applications, such as social networks, biochemical molecules, and traffic maps. Graph neural networks (GNNs) are deep learning models adapted to this irregular, non-Euclidean data, but they rely heavily on manual labels. Learning generalizable and reliable representations for unlabeled graph-structured data has therefore become an attractive and trending task in academia because of its promising application scenarios. Recently, numerous self-supervised learning algorithms for GNNs (SSL-GNNs) have been proposed and have succeeded on this task. However, the proposed methods are often evaluated with different architectures and evaluation procedures on different small-scale datasets, which makes model comparisons unreliable. To address this problem, this thesis builds a benchmark with a unified framework, a standard evaluation process, and replaceable building blocks. Nine state-of-the-art SSL-GNN algorithms are implemented and compared on this benchmark under consistent settings: a shared GNN encoder architecture, a common pre-training and fine-tuning scheme, and a unified evaluation protocol. Each model is pre-trained on the large-scale ZINC-15 dataset of two million molecules and then fine-tuned on eight biophysical downstream datasets for graph classification. The experimental results show that two of the nine algorithms outperform the others under this benchmark setting. Furthermore, the comparison reveals correlations between the pre-training dataset and certain fine-tuning datasets, which are analyzed in terms of the model mechanisms. The implemented benchmark and the findings of this thesis are expected to promote transfer learning in graph representation learning.
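
The pre-training and fine-tuning scheme summarized above can be illustrated with a minimal sketch. It assumes PyTorch and PyTorch Geometric, a GIN-style encoder with mean pooling, placeholder data loaders, and a placeholder ssl_loss objective; it is not the thesis's actual implementation, only an outline of the shared-encoder transfer-learning setup it describes.

import torch
from torch import nn
from torch_geometric.nn import GINConv, global_mean_pool


class GINEncoder(nn.Module):
    """Shared GNN encoder reused by every benchmarked SSL objective (assumed architecture)."""

    def __init__(self, in_dim, hidden_dim, num_layers=5):
        super().__init__()
        self.layers = nn.ModuleList()
        dim = in_dim
        for _ in range(num_layers):
            mlp = nn.Sequential(nn.Linear(dim, hidden_dim), nn.ReLU(),
                                nn.Linear(hidden_dim, hidden_dim))
            self.layers.append(GINConv(mlp))
            dim = hidden_dim

    def forward(self, x, edge_index, batch):
        # x: float node features, edge_index: graph connectivity,
        # batch: graph id of each node (for pooling a mini-batch of graphs)
        for conv in self.layers:
            x = torch.relu(conv(x, edge_index))
        return global_mean_pool(x, batch)  # one embedding per graph


def pretrain(encoder, unlabeled_loader, ssl_loss, epochs=100, lr=1e-3):
    """Self-supervised pre-training on unlabeled molecular graphs (e.g. ZINC-15)."""
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(epochs):
        for data in unlabeled_loader:
            opt.zero_grad()
            z = encoder(data.x, data.edge_index, data.batch)
            loss = ssl_loss(z, data)  # the SSL objective is what differs between algorithms
            loss.backward()
            opt.step()
    return encoder


def finetune(encoder, labeled_loader, hidden_dim, num_tasks, epochs=50, lr=1e-3):
    """Supervised fine-tuning with a fresh classification head on a downstream dataset."""
    head = nn.Linear(hidden_dim, num_tasks)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=lr)
    criterion = nn.BCEWithLogitsLoss()  # binary graph labels assumed for the downstream tasks
    for _ in range(epochs):
        for data in labeled_loader:
            opt.zero_grad()
            logits = head(encoder(data.x, data.edge_index, data.batch))
            loss = criterion(logits, data.y.float().view_as(logits))
            loss.backward()
            opt.step()
    return encoder, head

In this sketch, the encoder weights returned by pretrain are passed unchanged into finetune, mirroring the shared-encoder transfer setup in which only the self-supervised objective varies across the nine benchmarked algorithms.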

Description

Supervisor

Solin, Arno

Thesis advisor

Verma, Vikas

Keywords

machine learning, benchmark, graph neural networks, self-supervised learning, pre-training and fine-tuning

Other note

Citation