Improved Small Molecule Identification through Learning Combinations of Kernel Regression Models

dc.contributor: Aalto-yliopisto (fi)
dc.contributor: Aalto University (en)
dc.contributor.author: Brouard, Celine
dc.contributor.author: Basse, Antoine
dc.contributor.author: d'Alche-Buc, Florence
dc.contributor.author: Rousu, Juho
dc.contributor.department: Institut National de la Recherche Agronomique
dc.contributor.department: Institut Polytechnique de Paris
dc.contributor.department: Professorship Rousu Juho
dc.contributor.department: Department of Computer Science (en)
dc.date.accessioned: 2019-09-20T11:06:47Z
dc.date.available: 2019-09-20T11:06:47Z
dc.date.issued: 2019-08
dc.description.abstract: In small molecule identification from tandem mass (MS/MS) spectra, input-output kernel regression (IOKR) currently provides the state-of-the-art combination of fast training and prediction and high identification rates. The IOKR approach can be understood as predicting a fingerprint vector from the MS/MS spectrum of the unknown molecule and then solving a pre-image problem to find the molecule with the most similar fingerprint. In this paper, we bring forward the following improvements to the IOKR framework. Firstly, we formulate the IOKRreverse model, which can be understood as mapping molecular structures into the MS/MS feature space and solving a pre-image problem to find the molecule whose predicted spectrum is closest to the input MS/MS spectrum. Secondly, we introduce IOKRfusion, an approach that combines several IOKR and IOKRreverse models computed from different input and output kernels. The method minimizes a structured hinge loss of the combined model using mini-batch stochastic subgradient optimization. Our experiments show a consistent improvement in top-k accuracy on both positive and negative ionization mode data. (en)
dc.description.version: Peer reviewed (en)
dc.format.extent: 14
dc.format.mimetype: application/pdf
dc.identifier.citation: Brouard, C, Basse, A, d'Alche-Buc, F & Rousu, J 2019, 'Improved Small Molecule Identification through Learning Combinations of Kernel Regression Models', METABOLITES, vol. 9, no. 8, 160. https://doi.org/10.3390/metabo9080160 (en)
dc.identifier.doi: 10.3390/metabo9080160
dc.identifier.issn: 2218-1989
dc.identifier.other: PURE UUID: 02c66961-2c2f-4828-b323-2ecaf945abfb
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/02c66961-2c2f-4828-b323-2ecaf945abfb
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/36899371/metabolites_09_00160_v2.pdf
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/40274
dc.identifier.urn: URN:NBN:fi:aalto-202306053527
dc.language.iso: en (en)
dc.publisher: Multidisciplinary Digital Publishing Institute (MDPI)
dc.relation.ispartofseries: METABOLITES (en)
dc.relation.ispartofseries: Volume 9, issue 8 (en)
dc.rights: openAccess (en)
dc.subject.keyword: metabolite identification
dc.subject.keyword: machine learning
dc.subject.keyword: structured prediction
dc.subject.keyword: kernel methods
dc.subject.keyword: METABOLITE IDENTIFICATION
dc.subject.keyword: PREDICTION
dc.title: Improved Small Molecule Identification through Learning Combinations of Kernel Regression Models (en)
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä [A1 Original article in a scientific journal] (fi)
dc.type.version: publishedVersion
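
The abstract describes IOKR as predicting a fingerprint representation from an MS/MS spectrum and then solving a pre-image problem over candidate molecules. The following is a minimal sketch of that scoring step, not the authors' implementation. It assumes the predictor is a kernel ridge regressor (an assumption not stated in this record) and uses linear kernels for simplicity. The function name `iokr_rank`, the regularization value, and the toy spectra, fingerprints and candidates are all hypothetical. IOKRreverse and IOKRfusion are not covered.

```python
import numpy as np


def iokr_rank(K_x, k_x_new, K_y_cand, k_y_cand_diag, lam=0.01):
    """Rank candidates for one query spectrum (hypothetical helper).

    K_x           : (n, n) input kernel between training spectra
    k_x_new       : (n,)   input kernel between training spectra and the query
    K_y_cand      : (n, m) output kernel between training molecules and candidates
    k_y_cand_diag : (m,)   output kernel of each candidate with itself
    lam           : ridge regularization strength (assumed value)
    """
    n = K_x.shape[0]
    # Kernel ridge regression: h(x) = sum_i alpha_i * psi(y_i),
    # with alpha = (K_x + lam * I)^{-1} k_x(x).
    alpha = np.linalg.solve(K_x + lam * np.eye(n), k_x_new)
    # <h(x), psi(c)> for every candidate c, via the kernel trick.
    inner = K_y_cand.T @ alpha
    # ||h(x) - psi(c)||^2 = ||h(x)||^2 - 2 <h(x), psi(c)> + k_y(c, c).
    # The first term is shared by all candidates, so ranking by
    # 2 <h(x), psi(c)> - k_y(c, c) is ranking by closeness to h(x).
    scores = 2.0 * inner - k_y_cand_diag
    return np.argsort(-scores), scores


# Toy data, invented purely for illustration.
X_train = np.eye(4)                       # four training "spectra"
Y_train = np.array([[1, 1, 0, 0],         # their binary "fingerprints"
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 1]], dtype=float)
x_new = np.array([0.1, 0.9, 0.0, 0.0])    # query resembling training spectrum 1
candidates = np.array([[1, 0, 1, 0],      # candidate 0: fingerprint of molecule 1
                       [1, 1, 1, 1],
                       [0, 1, 0, 1]], dtype=float)

K_x = X_train @ X_train.T                 # linear input kernel
k_x_new = X_train @ x_new
K_y_cand = Y_train @ candidates.T         # linear output kernel
k_y_cand_diag = np.sum(candidates ** 2, axis=1)

ranking, scores = iokr_rank(K_x, k_x_new, K_y_cand, k_y_cand_diag)
print(ranking, scores)
```

On this toy input the ranking places candidate 0 first, since its fingerprint is closest to the predicted one. Swapping in other input or output kernels only changes how the four kernel arguments are computed.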