ChromDMM: a Dirichlet-multinomial mixture model for clustering heterogeneous epigenetic data

Access rights

openAccess
publishedVersion

A1 Original article in a scientific journal

Language

en

Pages

8

Series

Bioinformatics, Volume 38, issue 16, pp. 3863-3870

Abstract

Motivation: Research on epigenetic modifications and other chromatin features at genomic regulatory elements elucidates essential biological mechanisms including the regulation of gene expression. Despite the growing number of epigenetic datasets, new tools are still needed to discover novel distinctive patterns of heterogeneous epigenetic signals at regulatory elements.

Results: We introduce ChromDMM, a product Dirichlet-multinomial mixture model for clustering genomic regions that are characterized by multiple chromatin features. ChromDMM extends the mixture model framework by profile shifting and flipping that can probabilistically account for inaccuracies in the position and strand-orientation of the genomic regions. Owing to hyper-parameter optimization, ChromDMM can also regularize the smoothness of the epigenetic profiles across the consecutive genomic regions. With simulated data, we demonstrate that ChromDMM clusters, shifts and strand-orients the profiles more accurately than previous methods. With ENCODE data, we show that the clustering of enhancer regions in the human genome reveals distinct patterns in several chromatin features. We further validate the enhancer clusters by their enrichment for transcriptional regulatory factor binding sites.
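
To make the idea of a product Dirichlet-multinomial mixture with strand flipping concrete, the following is a minimal sketch and not the authors' implementation. It assumes each region is a list of binned count profiles (one per chromatin feature), that each cluster holds one Dirichlet parameter vector per feature, and that the flip prior is uniform; profile shifting and hyper-parameter optimization are omitted. All function and variable names are hypothetical.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for x, al in zip(counts, alpha):
        out += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return out


def region_log_lik(profiles, alphas, flip=False):
    """Sum of per-feature log-likelihoods (the 'product' over features).

    If flip is True, every profile is reversed to model the opposite
    strand orientation of the region.
    """
    total = 0.0
    for counts, alpha in zip(profiles, alphas):
        oriented = counts[::-1] if flip else counts
        total += dm_log_pmf(oriented, alpha)
    return total


def responsibilities(profiles, cluster_alphas, weights):
    """Posterior over (cluster, flip) states for one region.

    Assumes a uniform prior of 0.5 on each flip state (an assumption of
    this sketch). Uses the log-sum-exp trick for numerical stability.
    """
    logs = {}
    for k, (alphas, w) in enumerate(zip(cluster_alphas, weights)):
        for flip in (False, True):
            logs[(k, flip)] = (
                math.log(w) + math.log(0.5)
                + region_log_lik(profiles, alphas, flip)
            )
    m = max(logs.values())
    z = sum(math.exp(v - m) for v in logs.values())
    return {key: math.exp(v - m) / z for key, v in logs.items()}


if __name__ == "__main__":
    # One region, one feature, three bins; a single cluster whose
    # profile is concentrated in the first bin.
    region = [[10, 0, 0]]
    clusters = [[[5.0, 1.0, 1.0]]]
    print(responsibilities(region, clusters, [1.0]))
```

In a full mixture model these posterior weights would feed an expectation-maximization loop that updates the mixture weights and Dirichlet parameters; the sketch shows only the per-region posterior computation.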

Description

Publisher Copyright: © 2022 The Author(s). Published by Oxford University Press.

Citation

Osmala, M, Eraslan, G & Lähdesmäki, H 2022, 'ChromDMM: a Dirichlet-multinomial mixture model for clustering heterogeneous epigenetic data', Bioinformatics, vol. 38, no. 16, pp. 3863-3870. https://doi.org/10.1093/bioinformatics/btac444