MixChIP: a probabilistic method for cell type specific protein-DNA binding analysis

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Rautio, Sini [en_US]
dc.contributor.author: Lähdesmäki, Harri [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.date.accessioned: 2016-12-16T14:09:47Z
dc.date.issued: 2015-12-24 [en_US]
dc.description.abstract: BACKGROUND: Transcription factors (TFs) are proteins that bind to DNA and regulate gene expression. To understand details of gene regulation, characterizing TF binding sites in different cell types, diseases and among individuals is essential. However, sometimes TF binding can only be measured from biological samples that contain multiple cell or tissue types. Sample heterogeneity can have a considerable effect on TF binding site detection. While manual separation techniques can be used to isolate a cell type of interest from heterogeneous samples, such techniques are challenging and can change intra-cellular interactions, including protein-DNA binding. Computational deconvolution methods have emerged as an alternative strategy to study heterogeneous samples and numerous methods have been proposed to analyze gene expression. However, no computational method exists to deconvolve cell type specific TF binding from heterogeneous samples. RESULTS: We present a probabilistic method, MixChIP, to identify cell type specific TF binding sites from heterogeneous chromatin immunoprecipitation sequencing (ChIP-seq) data. Our method simultaneously estimates the binding strength in different cell types as well as the proportions of different cell types in each sample when only partial prior information about cell type composition is available. We demonstrate the utility of MixChIP by analyzing ChIP-seq data from two cell lines which we artificially mix to generate (simulated) heterogeneous samples and by analyzing ChIP-seq data from breast cancer patients measuring oestrogen receptor (ER) binding in primary breast cancer tissues. We show that MixChIP is more accurate in detecting TF binding sites from multiple heterogeneous ChIP-seq samples than the standard methods which do not account for sample heterogeneity. CONCLUSIONS: Our results show that MixChIP can estimate cell-type proportions and identify cell type specific TF binding sites from heterogeneous ChIP-seq samples. Thus, MixChIP can be an invaluable tool in analyzing heterogeneous ChIP-seq samples, such as those originating from cancer studies. R implementation is available at http://research.ics.aalto.fi/csb/software/mixchip/ . [en]
dc.description.version: Peer reviewed [en]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Rautio, S & Lähdesmäki, H 2015, 'MixChIP : a probabilistic method for cell type specific protein-DNA binding analysis', BMC Bioinformatics. https://doi.org/10.1186/s12859-015-0834-3 [en]
dc.identifier.doi: 10.1186/s12859-015-0834-3 [en_US]
dc.identifier.issn: 1471-2105
dc.identifier.other: PURE UUID: 5579cd2b-b73d-4e1f-8b3d-4f2014fa22cb [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/5579cd2b-b73d-4e1f-8b3d-4f2014fa22cb [en_US]
dc.identifier.other: PURE FILEURL: https://research.aalto.fi/files/9934788/art_3A10.1186_2Fs12859_015_0834_3.pdf [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/23789
dc.identifier.urn: URN:NBN:fi:aalto-201612165966
dc.language.iso: en [en]
dc.publisher: BioMed Central
dc.relation.ispartofseries: BMC Bioinformatics [en]
dc.rights: openAccess [en]
dc.title: MixChIP: a probabilistic method for cell type specific protein-DNA binding analysis [en]
dc.type: A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä [fi]
dc.type.version: publishedVersion

Files

Original bundle

Name: art_3A10.1186_2Fs12859_015_0834_3.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format