Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets

dc.contributor Aalto-yliopisto fi
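dc.contributor Aalto University en
dc.contributor.author Toivonen, Jarkko
dc.contributor.author Kivioja, Teemu
dc.contributor.author Jolma, Arttu
dc.contributor.author Yin, Yimeng
dc.contributor.author Taipale, Jussi
dc.contributor.author Ukkonen, Esko
dc.date.accessioned 2018-09-06T10:17:19Z
dc.date.available 2018-09-06T10:17:19Z
dc.date.issued 2018-05-04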
dc.identifier.citation Toivonen, J, Kivioja, T, Jolma, A, Yin, Y, Taipale, J & Ukkonen, E 2018, 'Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets', NUCLEIC ACIDS RESEARCH, vol 46, no. 8. DOI: 10.1093/nar/gky027 en
dc.identifier.issn 0305-1048
dc.identifier.issn 1362-4962
dc.identifier.other PURE UUID: c73d1820-3d39-4aa1-9fcb-159e90d09db9
dc.identifier.other PURE ITEMURL:
dc.identifier.other PURE FILEURL:
dc.description.abstract In some dimeric cases of transcription factor (TF) binding, the specificity of dimeric motifs has been observed to differ notably from what would be expected were the two factors to bind to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in a modular fashion such that deviations from the expected motif would become explicit and the noise from dimeric occurrences would not corrupt monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as the software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns, within a single probabilistic framework, a mixture model which represents monomeric motifs as standard position-specific probability matrices (PPMs), and dimeric motifs as pairs of monomeric PPMs with associated orientation and spacing preferences. For dimers, the model represents deviations from a pure modular model of two independent monomers, thus making co-operative binding effects explicit. MODER can analyze tens of Mbps of training data in reasonable time. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but whose observed model is directional. en
dc.format.extent 16
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation.ispartofseries NUCLEIC ACIDS RESEARCH en
dc.relation.ispartofseries Volume 46, issue 8 en
dc.rights openAccess en
dc.subject.other 515 Psychology en
dc.title Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets en
dc.type A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä fi
dc.description.version Peer reviewed en
dc.contributor.department Helsinki Institute for Information Technology HIIT
dc.contributor.department University of Helsinki
dc.contributor.department Karolinska Institutet
dc.contributor.department University of Cambridge
dc.contributor.department Aalto University
dc.contributor.department Department of Computer Science en
dc.subject.keyword CHIP-SEQ DATA
dc.subject.keyword EM ALGORITHM
dc.subject.keyword DNA-BINDING
dc.subject.keyword HUMAN GENOME
dc.subject.keyword SITES
dc.subject.keyword SEQUENCE
dc.subject.keyword IDENTIFICATION
dc.subject.keyword SPECIFICITIES
dc.subject.keyword ALIGNMENT
dc.subject.keyword 515 Psychology
dc.identifier.urn URN:NBN:fi:aalto-201809064985
dc.identifier.doi 10.1093/nar/gky027
dc.type.version publishedVersion
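
The abstract describes dimeric motifs as pairs of monomeric PPMs with an orientation and a spacing, plus deviations from the model expected under independent binding. The following is a minimal illustrative sketch of that idea only, not MODER's implementation: the function names, the `flip_first`/`flip_second` orientation flags, the uniform gap columns, and the restriction to non-negative gaps are all assumptions made for this example.

```python
from typing import List

Column = List[float]   # probabilities for A, C, G, T, in that order
PPM = List[Column]     # one column per motif position

UNIFORM: Column = [0.25, 0.25, 0.25, 0.25]


def reverse_complement(ppm: PPM) -> PPM:
    """Reverse the column order and swap A<->T, C<->G within each column."""
    return [col[::-1] for col in reversed(ppm)]


def expected_dimer(first: PPM, second: PPM, gap: int,
                   flip_first: bool = False,
                   flip_second: bool = False) -> PPM:
    """Expected dimer PPM if the two monomers bound independently.

    `gap` is the number of unconstrained positions between the half-sites.
    Overlapping half-sites (negative gap) are not handled in this sketch.
    """
    if gap < 0:
        raise ValueError("this sketch only handles non-overlapping half-sites")
    left = reverse_complement(first) if flip_first else first
    right = reverse_complement(second) if flip_second else second
    spacer = [list(UNIFORM) for _ in range(gap)]
    return [list(c) for c in left] + spacer + [list(c) for c in right]


def deviation(observed: PPM, expected: PPM) -> PPM:
    """Per-position, per-base difference between observed and expected PPMs."""
    if len(observed) != len(expected):
        raise ValueError("models must have the same length")
    return [[o - e for o, e in zip(oc, ec)]
            for oc, ec in zip(observed, expected)]
```

In this sketch, a nonzero entry in the output of `deviation` marks a position where an observed dimeric model departs from the independent-binding expectation, which is the kind of effect the abstract says the model makes explicit.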
