ChromDMM: a Dirichlet-multinomial mixture model for clustering heterogeneous epigenetic data
Loading...
Access rights
openAccess
publishedVersion
URL
Journal Title
Journal ISSN
Volume Title
A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
Department
Major/Subject
Mcode
Degree programme
Language
en
Pages
8
Series
Bioinformatics, Volume 38, issue 16, pp. 3863-3870
Abstract
Motivation: Research on epigenetic modifications and other chromatin features at genomic regulatory elements elucidates essential biological mechanisms including the regulation of gene expression. Despite the growing number of epigenetic datasets, new tools are still needed to discover novel distinctive patterns of heterogeneous epigenetic signals at regulatory elements. Results: We introduce ChromDMM, a product Dirichlet-multinomial mixture model for clustering genomic regions that are characterized by multiple chromatin features. ChromDMM extends the mixture model framework by profile shifting and flipping that can probabilistically account for inaccuracies in the position and strand-orientation of the genomic regions. Owing to hyper-parameter optimization, ChromDMM can also regularize the smoothness of the epigenetic profiles across the consecutive genomic regions. With simulated data, we demonstrate that ChromDMM clusters, shifts and strand-orients the profiles more accurately than previous methods. With ENCODE data, we show that the clustering of enhancer regions in the human genome reveals distinct patterns in several chromatin features. We further validate the enhancer clusters by their enrichment for transcriptional regulatory factor binding sites.Description
Publisher Copyright: © 2022 The Author(s). Published by Oxford University Press.
Keywords
Other note
Citation
Osmala, M, Eraslan, G & Lähdesmäki, H 2022, 'ChromDMM: a Dirichlet-multinomial mixture model for clustering heterogeneous epigenetic data', Bioinformatics, vol. 38, no. 16, pp. 3863-3870. https://doi.org/10.1093/bioinformatics/btac444