Learning subtree pattern importance for Weisfeiler-Lehman based graph kernels

| Field | Value | Language |
| --- | --- | --- |
| dc.contributor | Aalto-yliopisto | fi |
| dc.contributor | Aalto University | en |
| dc.contributor.author | Nguyen, Dai Hai | en_US |
| dc.contributor.author | Nguyen, Canh Hao | en_US |
| dc.contributor.author | Mamitsuka, Hiroshi | en_US |
| dc.contributor.department | Department of Computer Science | en |
| dc.contributor.groupauthor | Professorship Kaski Samuel | en |
| dc.contributor.groupauthor | Helsinki Institute for Information Technology (HIIT) | en |
| dc.contributor.groupauthor | Probabilistic Machine Learning | en |
| dc.contributor.organization | University of Tokyo | en_US |
| dc.contributor.organization | Kyoto University | en_US |
| dc.date.accessioned | 2021-08-04T06:45:04Z | |
| dc.date.available | 2021-08-04T06:45:04Z | |
| dc.date.embargo | info:eu-repo/date/embargoEnd/2022-06-13 | en_US |
| dc.date.issued | 2021-07 | en_US |
| dc.description | Funding Information: D. H. N. has been supported in part by Otsuka Toshimi scholarship and JSPS Research Fellowship for Young Scientists (DC2) with KAKENHI [grant number 19J14714]. C. H. N. has been supported in part by MEXT KAKENHI [grant number 18K11434]. H. M. has been supported in part by JST ACCEL [grant number JPMJAC1503], MEXT KAKENHI [grant numbers 16H02868, 19H04169], FiDiPro by Tekes (currently Business Finland) and AIPSE program by Academy of Finland. Publisher Copyright: © 2021, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature. | |
| dc.description.abstract | Graphs are a common representation of relational data, which is ubiquitous in many domains, such as molecules and biological and social networks. A popular approach to learning with graph-structured data is to use graph kernels, which measure the similarity between graphs and are plugged into a kernel machine such as a support vector machine. Weisfeiler-Lehman (WL) based graph kernels, which employ the WL labeling scheme to extract subtree patterns and perform node embedding, have been shown to achieve strong performance while being efficiently computable. However, one of the main drawbacks of a general kernel is the decoupling of kernel construction from the learning process. For molecular graphs, common kernels such as WL subtree, which are based on substructures of the molecules, treat all available substructures as equally important, which might not be suitable in practice. In this paper, we propose a method to learn the weights of subtree patterns in the framework of WWL kernels, the state-of-the-art method for the graph classification task (Togninalli et al., in: Advances in Neural Information Processing Systems, pp. 6439–6449, 2019). To overcome the computational issue on large-scale data sets, we present an efficient learning algorithm and also derive a generalization gap bound to show its convergence. Finally, through experiments on synthetic and real-world data sets, we demonstrate the effectiveness of our proposed method for learning the weights of subtree patterns. | en |
| dc.description.version | Peer reviewed | en |
| dc.format.extent | 23 | |
| dc.format.extent | 1585-1607 | |
| dc.format.mimetype | application/pdf | en_US |
| dc.identifier.citation | Nguyen, D H, Nguyen, C H & Mamitsuka, H 2021, 'Learning subtree pattern importance for Weisfeiler-Lehman based graph kernels', Machine Learning, vol. 110, no. 7, pp. 1585-1607. https://doi.org/10.1007/s10994-021-05991-y | en |
| dc.identifier.doi | 10.1007/s10994-021-05991-y | en_US |
| dc.identifier.issn | 0885-6125 | |
| dc.identifier.issn | 1573-0565 | |
| dc.identifier.other | PURE UUID: c55528c6-9fed-4f9d-ba38-6022b09a25b7 | en_US |
| dc.identifier.other | PURE ITEMURL: https://research.aalto.fi/en/publications/c55528c6-9fed-4f9d-ba38-6022b09a25b7 | en_US |
| dc.identifier.other | PURE LINK: http://www.scopus.com/inward/record.url?scp=85107778519&partnerID=8YFLogxK | en_US |
| dc.identifier.other | PURE FILEURL: https://research.aalto.fi/files/65497721/Learning_subtree_pattern_importance_for_Weisfeiler_Lehman.final_version_mlj.pdf | en_US |
| dc.identifier.uri | https://aaltodoc.aalto.fi/handle/123456789/108957 | |
| dc.identifier.urn | URN:NBN:fi:aalto-202108048201 | |
| dc.language.iso | en | en |
| dc.publisher | Springer Netherlands | |
| dc.relation.ispartofseries | Machine Learning | en |
| dc.relation.ispartofseries | Volume 110, issue 7 | en |
| dc.rights | openAccess | en |
| dc.subject.keyword | Graph kernel | en_US |
| dc.subject.keyword | Optimal transport | en_US |
| dc.subject.keyword | Weisfeiler Lehman scheme | en_US |
| dc.title | Learning subtree pattern importance for Weisfeiler-Lehman based graph kernels | en |
| dc.type | A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (A1 Original article in a scientific journal) | fi |
