
GPrank


dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Topa, Hande
dc.contributor.author Honkela, Antti
dc.date.accessioned 2018-10-24T09:39:20Z
dc.date.available 2018-10-24T09:39:20Z
dc.date.issued 2018-10-04
dc.identifier.citation Topa, H & Honkela, A 2018, 'GPrank: An R package for detecting dynamic elements from genome-wide time series', BMC Bioinformatics, vol. 19, no. 1, 367, pp. 1-6. https://doi.org/10.1186/s12859-018-2370-4 en
dc.identifier.issn 1471-2105
dc.identifier.other PURE UUID: 33a99b12-24fc-4ada-9712-a275587de1c0
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/33a99b12-24fc-4ada-9712-a275587de1c0
dc.identifier.other PURE LINK: http://www.scopus.com/inward/record.url?scp=85054451238&partnerID=8YFLogxK
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/28765952/s12859_018_2370_4.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/34452
dc.description.abstract Background: Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite sequencing (BS-seq), or abundances of genetic variants in populations with pooled sequencing (Pool-seq). However, because of high experimental costs, the time series data sets often consist of a very limited number of time points with very few or no biological replicates, which poses challenges for data analysis. Results: Here we present the GPrank R package for modelling genome-wide time series by incorporating variance information obtained during pre-processing of the HTS data, either from probabilistic quantification methods or from a beta-binomial model using sequencing depth. GPrank is well suited for analysing both short and irregularly sampled time series. It models each time series with two Gaussian process (GP) models, one time-dependent and one time-independent, and compares the evidence the data provide under the two models by computing their Bayes factor (BF). Genomic elements are then ranked by their BFs, so that the most temporally dynamic elements can be identified. Conclusions: Incorporating the variance information helps GPrank avoid false positives without compromising computational efficiency. Fitted models can easily be explored further in a browser. Detecting and visualising the most temporally dynamic elements in the genome can provide a good starting point for downstream analyses that increase our understanding of the studied processes. en
dc.format.extent 1-6
dc.format.mimetype application/pdf
dc.language.iso en en
dc.relation.ispartofseries BMC Bioinformatics en
dc.relation.ispartofseries Volume 19, issue 1 en
dc.rights openAccess en
dc.title GPrank en
dc.type A1 Original article in a scientific journal en
dc.description.version Peer reviewed en
dc.contributor.department Department of Computer Science
dc.contributor.department University of Helsinki
dc.subject.keyword Bayes factor
dc.subject.keyword Gaussian process
dc.subject.keyword High-throughput sequencing
dc.subject.keyword R
dc.subject.keyword Ranking
dc.subject.keyword Time series
dc.subject.keyword Visualization
dc.identifier.urn URN:NBN:fi:aalto-201810245514
dc.identifier.doi 10.1186/s12859-018-2370-4
dc.type.version publishedVersion
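
The model comparison described in the abstract can be illustrated with a minimal sketch. It is written in Python with NumPy and is not the GPrank R API. The function names, the squared-exponential kernel, the mean-centring step, and the fixed hyperparameter values are all assumptions made for illustration; in particular, the hyperparameters are fixed here rather than fitted. The sketch adds the per-time-point variances to the diagonal of both covariance matrices and returns the log Bayes factor of the time-dependent model over the time-independent one.

```python
import numpy as np


def log_marginal_likelihood(y, cov):
    """Log density of y under a zero-mean Gaussian with covariance cov."""
    n = len(y)
    chol = np.linalg.cholesky(cov)
    # Solve cov @ alpha = y using the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    log_det_half = np.log(np.diag(chol)).sum()  # equals 0.5 * log|cov|
    return -0.5 * y @ alpha - log_det_half - 0.5 * n * np.log(2.0 * np.pi)


def rbf_kernel(t, signal_var, length_scale):
    """Squared-exponential covariance between all pairs of time points."""
    diff = t[:, None] - t[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def log_bayes_factor(t, y, obs_var, signal_var=1.0, length_scale=2.0,
                     noise_var=0.01):
    """Log BF of a time-dependent GP against a time-independent one.

    obs_var holds the known variance of each observation (hypothetical
    input standing in for the variances obtained during pre-processing).
    Hyperparameters are fixed here, which is a simplification.
    """
    y = y - y.mean()  # assumption: compare models on mean-centred data
    noise = np.diag(obs_var + noise_var)
    cov_dependent = rbf_kernel(t, signal_var, length_scale) + noise
    cov_independent = noise  # no temporal correlation, noise only
    return (log_marginal_likelihood(y, cov_dependent)
            - log_marginal_likelihood(y, cov_independent))


# Toy data (invented for illustration): one trending and one flat series.
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
variances = np.full(6, 0.01)
trending = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
flat = np.array([0.05, -0.05, 0.02, -0.02, 0.03, -0.03])

log_bf_trending = log_bayes_factor(times, trending, variances)
log_bf_flat = log_bayes_factor(times, flat, variances)
```

Ranking elements by this quantity places the trending series above the flat one. A flat series whose fluctuations are already explained by its observation variances gains nothing from the more flexible time-dependent model, which is how the variance information can help avoid false positives.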


