Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Eldjarn, Grimur Hjorleifsson
dc.contributor.author: Ramsay, Andrew
dc.contributor.author: Van Der Hooft, Justin J.J.
dc.contributor.author: Duncan, Katherine R.
dc.contributor.author: Soldatou, Sylvia
dc.contributor.author: Rousu, Juho
dc.contributor.author: Daly, Ronan
dc.contributor.author: Wandy, Joe
dc.contributor.author: Rogers, Simon
dc.contributor.department: University of Glasgow
dc.contributor.department: Wageningen University and Research Centre
dc.contributor.department: University of Strathclyde
dc.contributor.department: Robert Gordon University
dc.contributor.department: Computer Science Professors
dc.contributor.department: Department of Computer Science [en]
dc.date.accessioned: 2022-01-12T07:18:26Z
dc.date.available: 2022-01-12T07:18:26Z
dc.date.issued: 2021-05
dc.description: Publisher Copyright: © 2021 Public Library of Science. All rights reserved.
dc.description.abstract: Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 24
dc.format.extent: 1-24
dc.format.mimetype: application/pdf
dc.identifier.citation: Eldjarn, G H, Ramsay, A, Van Der Hooft, J J J, Duncan, K R, Soldatou, S, Rousu, J, Daly, R, Wandy, J & Rogers, S 2021, 'Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions', PLoS computational biology, vol. 17, no. 5, e1008920, pp. 1-24. https://doi.org/10.1371/journal.pcbi.1008920 [en]
dc.identifier.doi: 10.1371/journal.pcbi.1008920
dc.identifier.issn: 1553-734X
dc.identifier.issn: 1553-7358
dc.identifier.other: PURE UUID: 7d269324-499b-49d0-a5a7-e39d89fc1322
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/7d269324-499b-49d0-a5a7-e39d89fc1322
dc.identifier.other: PURE LINK: http://www.scopus.com/inward/record.url?scp=85105752089&partnerID=8YFLogxK
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/78004392/Ranking_microbial_metabolomic_and_genomic_links_in_the_NPLinker_framework_using_complementary_scoring_functions.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/112265
dc.identifier.urn: URN:NBN:fi:aalto-202201121173
dc.language.iso: en [en]
dc.publisher: Public Library of Science
dc.relation.ispartofseries: PLoS computational biology [en]
dc.relation.ispartofseries: Volume 17 [en]
dc.rights: openAccess [en]
dc.title: Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions [en]
dc.type: A1 Original article in a scientific journal [en]
dc.type.version: publishedVersion