A Dirichlet-Multinomial Mixture Model For Clustering Heterogeneous Epigenomics Data


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Osmala, Maria
dc.contributor.author Eraslan, Gokcen
dc.date.accessioned 2014-10-03T07:44:48Z
dc.date.available 2014-10-03T07:44:48Z
dc.date.issued 2014-09-29
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/14128
dc.description.abstract Epigenetic information sheds light on essential biological mechanisms, including the regulation of gene expression. Histone tail modifications are among the major epigenetic mechanisms and can be used to identify cis-regulatory elements such as promoters and enhancers. Nucleosome positions and open chromatin regions are other key elements of the epigenomic landscape. Advances in high-throughput sequencing technologies have made comprehensive genome-wide analyses of epigenetic signatures possible. Despite the growing number of epigenetic datasets, tools for discovering novel patterns and the combinatorial presence of epigenetic elements are still needed. In this thesis, we introduce a model-based clustering approach that uncovers epigenetic patterns by integrating multiple data tracks in a multi-view fashion, where different views correspond to different epigenetic signals extracted from the same genomic location. Moreover, to address inaccuracy in the positions of anchor points, such as transcription factor (TF) ChIP-seq peak summits or transcription start sites (TSS), a profile shifting feature is implemented. Finally, owing to hyperprior regularization, our approach can also account for the correlation between the numbers of reads mapped to consecutive base pair positions. We demonstrate that genome-wide clustering of promoter and enhancer regions in the human genome reveals distinct patterns in various histone modification and transcription factor ChIP-seq profiles. Furthermore, we investigate transcription factor binding site (TFBS) enrichment in the classes of enhancers and promoters identified by our method and find that some transcription factors are significantly enriched in a subset of enhancer and promoter clusters. en
dc.format.extent 72 + 6
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.title A Dirichlet-Multinomial Mixture Model For Clustering Heterogeneous Epigenomics Data en
dc.type G2 Pro gradu, diplomityö en
dc.contributor.school Perustieteiden korkeakoulu fi
dc.subject.keyword chromatin en
dc.subject.keyword enhancers en
dc.subject.keyword promoters en
dc.subject.keyword multi-view clustering en
dc.subject.keyword histone modifications en
dc.subject.keyword epigenomics en
dc.subject.keyword generative models en
dc.subject.keyword Dirichlet-multinomial en
dc.subject.keyword mixture model en
dc.identifier.urn URN:NBN:fi:aalto-201410062747
dc.programme.major Computational Systems Biology fi
dc.programme.mcode IL3013 fi
dc.type.ontasot Master's thesis en
dc.type.ontasot Diplomityö fi
dc.contributor.supervisor Lähdesmäki, Harri
dc.programme Master's Degree Programme in Computational and Systems Biology (euSYSBIO) fi
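
The abstract describes a mixture model in which each genomic region contributes one read-count profile per epigenetic signal (view). As a rough illustration only, the sketch below computes a Dirichlet-multinomial log-probability and the resulting cluster responsibilities for one region. It is not the thesis implementation: the function names `dm_log_pmf` and `responsibilities`, the toy parameters, and the assumption that views are conditionally independent given the cluster are all hypothetical, and the profile shifting and hyperprior regularization mentioned in the abstract are left out.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer read counts, one per bin.
    alpha:  positive concentration parameters, one per bin.
    """
    n = sum(counts)
    a = sum(alpha)
    # Terms that depend only on the totals: log[n! * Gamma(A) / Gamma(n + A)]
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    # Per-bin terms: log[Gamma(x + alpha) / (x! * Gamma(alpha))]
    for x, al in zip(counts, alpha):
        out += math.lgamma(x + al) - math.lgamma(x + 1) - math.lgamma(al)
    return out


def responsibilities(views, weights, alphas):
    """Posterior cluster probabilities for one region.

    views:   list of count vectors, one per view (hypothetical layout).
    weights: mixture weights, one per cluster.
    alphas:  alphas[k][v] is the concentration vector of cluster k, view v.
    Assumes views are independent given the cluster (an assumption of this
    sketch, not a statement about the thesis).
    """
    log_joint = []
    for pi_k, alpha_k in zip(weights, alphas):
        lj = math.log(pi_k)
        lj += sum(dm_log_pmf(x, a) for x, a in zip(views, alpha_k))
        log_joint.append(lj)
    # Normalize with the log-sum-exp trick for numerical stability.
    m = max(log_joint)
    unnorm = [math.exp(v - m) for v in log_joint]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Toy example: two views with three bins each, two clusters.
views = [[8, 1, 1], [0, 2, 8]]
weights = [0.5, 0.5]
alphas = [
    [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]],  # cluster 0
    [[1.0, 1.0, 5.0], [5.0, 1.0, 1.0]],  # cluster 1
]
r = responsibilities(views, weights, alphas)
```

In this toy setup the region's counts match the shape of cluster 0 in both views, so `r[0]` comes out larger than `r[1]`. In a full expectation-maximization fit, responsibilities like these would be recomputed for every region in each iteration before the parameters are updated.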

