MLapSVM-LBS: Predicting DNA-binding proteins via a multiple Laplacian regularized support vector machine with local behavior similarity
Access rights
openAccess
publishedVersion
A1 Original article in a scientific journal
This publication is imported from Aalto University research portal.
Date
2022-08-17
Language
en
Pages
8
Series
Knowledge-Based Systems, Volume 250, pp. 1-8
Abstract
DNA-binding proteins (DBPs) are of great significance in many basic cellular processes. Experiment-based methods for identifying DBPs are costly and time-consuming. To deal with large-scale DBP identification tasks, a variety of computation-based methods have been developed. Inspired by previous work, we propose a multiple Laplacian regularized support vector machine with local behavior similarity (MLapSVM-LBS) to predict DBPs. We serially combine three features extracted from protein sequences (PsePSSM, GE, and NMBAC) and feed them into MLapSVM-LBS. Based on human behavior learning theory, MLapSVM-LBS can better represent the relationship between samples through local behavior similarity. We introduce a new edge weight calculation method that takes label information into consideration. In addition, a local distribution parameter reflecting the underlying probability distribution of a sample's neighborhood is also employed. To further improve the robustness of the model, we utilize multiple Laplacian regularization to build a multigraph model in which five Laplacian graphs are constructed with local behavior similarity by changing the neighborhood size. To appraise the performance of our model, MLapSVM-LBS is trained and tested on the PDB186, PDB1075, PDB2272 and PDB14189 datasets. On two independent testing sets (PDB186 and PDB2272), our method reaches accuracies of 0.887 and 0.712, respectively. The good results on both datasets demonstrate the reliable performance of our model.
Description
openaire: EC/H2020/101016775/EU//INTERVENE
Funding Information: This work is supported by a grant from the National Natural Science Foundation of China (NSFC 62172076, 61902271), the Academy of Finland (grants 336033, 315896), Business Finland (grant 884/31/2018), EU H2020 (grant 101016775), the Natural Science Research of Jiangsu Higher Education Institutions of China (19KJB520014) and the Municipal Government of Quzhou (Grant Number 2020D003 and 2021D004).
Publisher Copyright: © 2022 The Author(s)
Keywords
DNA-binding proteins, Laplacian support vector machine, Multiple view, Protein feature extraction, Sequence classification
Citation
Sun, M, Tiwari, P, Qian, Y, Ding, Y & Zou, Q 2022, 'MLapSVM-LBS: Predicting DNA-binding proteins via a multiple Laplacian regularized support vector machine with local behavior similarity', Knowledge-Based Systems, vol. 250, 109174, pp. 1-8. https://doi.org/10.1016/j.knosys.2022.109174
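Illustrative sketch
The abstract describes concatenating three feature blocks and building five Laplacian graphs by varying the neighborhood size. The following is a minimal sketch of that general multigraph idea only, not the authors' implementation. It assumes plain Gaussian (heat kernel) edge weights as a stand-in for the paper's local behavior similarity, which also uses label information and a local distribution parameter not specified in this record. The function names, the neighborhood sizes, the uniform graph weights, and the random feature blocks are all hypothetical.

```python
import numpy as np

def knn_laplacian(X, k, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    Assumption: Gaussian edge weights stand in for the paper's
    local behavior similarity, which is not reproduced here.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    # Indices of the k nearest neighbors of each sample.
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)  # symmetrize the directed k-NN graph
    return np.diag(W.sum(axis=1)) - W

def multi_laplacian(X, ks=(3, 5, 7, 9, 11), weights=None):
    """Weighted sum of Laplacians built with different neighborhood sizes.

    Assumption: five hypothetical sizes with uniform weights.
    """
    ks = list(ks)
    if weights is None:
        weights = np.full(len(ks), 1.0 / len(ks))
    return sum(w * knn_laplacian(X, k) for w, k in zip(weights, ks))

rng = np.random.default_rng(0)
# Hypothetical random stand-ins for the three sequence feature blocks.
f_a = rng.normal(size=(20, 4))
f_b = rng.normal(size=(20, 4))
f_c = rng.normal(size=(20, 4))
X = np.hstack([f_a, f_b, f_c])  # serial (column-wise) combination
L = multi_laplacian(X)
```

The combined matrix `L` could then serve as the manifold regularizer in a Laplacian SVM objective. Each per-graph Laplacian is symmetric and positive semidefinite, so any nonnegative weighted sum of them is as well.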