SuperDCA for genome-wide epistasis analysis

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Puranen, Santeri
dc.contributor.author Pesonen, Maiju
dc.contributor.author Pensar, Johan
dc.contributor.author Xu, Yingying
dc.contributor.author Lees, John A.
dc.contributor.author Bentley, Stephen
dc.contributor.author Croucher, Nicholas J
dc.contributor.author Corander, Jukka
dc.date.accessioned 2020-02-03T09:03:55Z
dc.date.available 2020-02-03T09:03:55Z
dc.date.issued 2018-05-29
dc.identifier.citation Puranen, S, Pesonen, M, Pensar, J, Xu, Y, Lees, J A, Bentley, S, Croucher, N J & Corander, J 2018, 'SuperDCA for genome-wide epistasis analysis', Microbial Genomics, vol. 4, no. 6. https://doi.org/10.1099/mgen.0.000184 en
dc.identifier.issn 2057-5858
dc.identifier.other PURE UUID: fc897d05-1abf-4037-9f2b-0271700cecb6
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/fc897d05-1abf-4037-9f2b-0271700cecb6
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/40402323/Puranen_et.al_SuperDCA.mgen000184_1.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/42973
dc.description.abstract The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10^4–10^5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10^5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level. en
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher Microbiology Society
dc.relation.ispartofseries Microbial Genomics en
dc.relation.ispartofseries Volume 4, issue 6 en
dc.rights openAccess en
dc.title SuperDCA for genome-wide epistasis analysis en
dc.type A1 Original article in a scientific journal en
dc.description.version Peer reviewed en
dc.contributor.department Helsinki Institute for Information Technology (HIIT)
dc.contributor.department Centre of Excellence in Computational Inference, COIN
dc.contributor.department University of Helsinki
dc.contributor.department Department of Computer Science
dc.contributor.department New York University
dc.contributor.department Wellcome Trust Sanger Institute
dc.contributor.department Imperial College London
dc.subject.keyword epistasis
dc.subject.keyword linkage disequilibrium
dc.subject.keyword population genomics
dc.identifier.urn URN:NBN:fi:aalto-202002032053
dc.identifier.doi 10.1099/mgen.0.000184
dc.type.version publishedVersion


Files in this item

There are no open access files associated with this item.
