Identification of DNA N4-methylcytosine sites via multi-view kernel sparse representation model
Access rights
openAccess
A1 Original article in a scientific journal
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Other link related to publication
Date
2023-10
Language
en
Pages
10
Series
IEEE Transactions on Artificial Intelligence, article number 9809784
Abstract
Identifying DNA N4-methylcytosine (4mC) sites is of great significance in biological research on chromatin structure, DNA stability, DNA-protein interaction, and the control of gene expression. However, identifying 4mC sites with traditional sequencing technology is very time-consuming. To detect 4mC sites more effectively, we develop a multi-view learning method that merges multiple feature spaces. Furthermore, we examine whether the multi-view learning method can improve cross-species classification ability by fusing data from multiple species. In our study, we propose a multi-view Laplacian kernel sparse representation-based classifier, called MvLapKSRC-HSIC. First, we use three feature extraction methods (PSTNP, NCP, DPP) to extract DNA sequence features. MvLapKSRC-HSIC uses a kernel sparse representation-based classifier with graph regularization. To maintain independence between the views, we add a multi-view regularization term constructed from the Hilbert-Schmidt independence criterion (HSIC). In the experiments, MvLapKSRC-HSIC is applied to six datasets and compared with other popular methods in single-species and cross-species experiments. All experimental results show that MvLapKSRC-HSIC is superior to the other methods in both single-species and cross-species settings. Importantly, MvLapKSRC-HSIC can identify a series of potential DNA 4mC sites that have not yet been experimentally evaluated in multiple species and merit further research.
Description
openaire: EC/H2020/101016775/EU//INTERVENE
Keywords
DNA, Kernel, Feature extraction, Training, Laplace equations, Learning systems, Support vector machines
Citation
Ai, C, Tiwari, P, Yang, H, Ding, Y, Tang, J & Guo, F 2023, 'Identification of DNA N4-methylcytosine sites via multi-view kernel sparse representation model', IEEE Transactions on Artificial Intelligence, vol. 4, no. 5, 9809784, pp. 1236-1245. https://doi.org/10.1109/TAI.2022.3187060