DeepGraphGO: Graph neural network for large-scale, multispecies protein function prediction

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: You, Ronghui [en_US]
dc.contributor.author: Yao, Shuwei [en_US]
dc.contributor.author: Mamitsuka, Hiroshi [en_US]
dc.contributor.author: Zhu, Shanfeng [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Probabilistic Machine Learning [en]
dc.contributor.groupauthor: Helsinki Institute for Information Technology (HIIT) [en]
dc.contributor.groupauthor: Professorship Kaski Samuel [en]
dc.contributor.organization: Fudan University [en_US]
dc.date.accessioned: 2021-08-25T06:54:00Z
dc.date.available: 2021-08-25T06:54:00Z
dc.date.issued: 2021-07-01 [en_US]
dc.description: Publisher Copyright: © 2021 Oxford University Press. All rights reserved.
dc.description.abstract: Motivation: Automated function prediction (AFP) of proteins is a large-scale multi-label classification problem. Two limitations of most network-based methods for AFP are (i) a single model must be trained for each species and (ii) protein sequence information is totally ignored. These limitations cause weaker performance than sequence-based methods. Thus, the challenge is how to develop a powerful network-based method for AFP to overcome these limitations. Results: We propose DeepGraphGO, an end-to-end, multispecies graph neural network-based method for AFP, which makes the most of both protein sequence and high-order protein network information. Our multispecies strategy allows one single model to be trained for all species, indicating a larger number of training samples than existing methods. Extensive experiments with a large-scale dataset show that DeepGraphGO outperforms a number of competing state-of-the-art methods significantly, including DeepGOPlus and three representative network-based methods: GeneMANIA, deepNF and clusDCA. We further confirm the effectiveness of our multispecies strategy and the advantage of DeepGraphGO over so-called difficult proteins. Finally, we integrate DeepGraphGO into the state-of-the-art ensemble method, NetGO, as a component and achieve a further performance improvement. Availability and implementation: https://github.com/yourh/DeepGraphGO. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: I262-I271
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: You, R, Yao, S, Mamitsuka, H & Zhu, S 2021, 'DeepGraphGO: Graph neural network for large-scale, multispecies protein function prediction', Bioinformatics, vol. 37, pp. I262-I271. https://doi.org/10.1093/bioinformatics/btab270 [en]
dc.identifier.doi: 10.1093/bioinformatics/btab270 [en_US]
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1460-2059
dc.identifier.other: PURE UUID: bede6bf7-b3d8-4ad7-8ceb-4d7982649482 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/bede6bf7-b3d8-4ad7-8ceb-4d7982649482 [en_US]
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85112020570&partnerID=8YFLogxK [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/66692009/DeepGraphGO.btab270.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/109175
dc.identifier.urn: URN:NBN:fi:aalto-202108258412
dc.language.iso: en [en]
dc.publisher: OXFORD UNIV PRESS INC
dc.relation.ispartofseries: Bioinformatics [en]
dc.relation.ispartofseries: Volume 37 [en]
dc.rights: openAccess [en]
dc.title: DeepGraphGO: Graph neural network for large-scale, multispecies protein function prediction [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (A1 Original article in a scientific journal) [fi]
dc.type.version: publishedVersion