
GPrank: an R package for detecting dynamic elements from genome-wide time series

BACKGROUND: Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite sequenc...


Bibliographic Details
Main authors: Topa, Hande; Honkela, Antti
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6172792/
https://www.ncbi.nlm.nih.gov/pubmed/30286713
http://dx.doi.org/10.1186/s12859-018-2370-4
_version_ 1783361011754991616
author Topa, Hande
Honkela, Antti
author_facet Topa, Hande
Honkela, Antti
author_sort Topa, Hande
collection PubMed
description BACKGROUND: Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite sequencing (BS-seq), or abundances of genetic variants in populations with pooled sequencing (Pool-seq). However, because of high experimental costs, the time series data sets often consist of a very limited number of time points with very few or no biological replicates, which poses challenges for data analysis. RESULTS: Here we present the GPrank R package for modelling genome-wide time series by incorporating variance information obtained during pre-processing of the HTS data using probabilistic quantification methods or from a beta-binomial model using sequencing depth. GPrank is well-suited for analysing both short and irregularly sampled time series. It models each time series with two Gaussian process (GP) models, one time-dependent and one time-independent, and compares the evidence the data provide for the two models by computing their Bayes factor (BF). Genomic elements are then ranked by their BFs, so that the temporally most dynamic elements can be identified. CONCLUSIONS: Incorporating the variance information helps GPrank avoid false positives without compromising computational efficiency. Fitted models can easily be explored further in a browser. Detecting and visualising the temporally most dynamic elements in the genome can provide a good starting point for downstream analyses that increase our understanding of the studied processes.
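The model comparison described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the GPrank API. Every name in it (`rbf_kernel`, `log_bayes_factor`, `rank_by_bayes_factor`, the grid constants) is hypothetical. It also makes several simplifying assumptions that the record does not confirm: a squared-exponential kernel for the time-dependent model, a coarse grid search in place of proper hyperparameter optimisation, mean-centred observations, and strictly positive per-point variances supplied by the caller as the "variance information".

```python
import math

# Hypothetical hyperparameter grids; a real analysis would optimise these.
SIGNAL_VARS = (0.5, 1.0, 2.0, 4.0)
LENGTHSCALES = (1.0, 2.0, 4.0, 8.0)
WHITE_VARS = (0.0, 0.01, 0.1, 0.5, 1.0, 2.0, 4.0)


def rbf_kernel(times, signal_var, lengthscale):
    """Squared-exponential covariance matrix over the given time points."""
    return [[signal_var * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)
             for b in times] for a in times]


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower


def gaussian_log_likelihood(y, cov):
    """log N(y | 0, cov), computed through the Cholesky factor."""
    n = len(y)
    lower = cholesky(cov)
    z = []  # forward substitution: solve lower * z = y
    for i in range(n):
        z.append((y[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * (quad + log_det + n * math.log(2.0 * math.pi))


def log_bayes_factor(times, y, noise_var):
    """Approximate log BF of the time-dependent over the time-independent model.

    noise_var holds one fixed, strictly positive variance per observation.
    Each evidence is approximated by its maximum over the grid (an assumption).
    """
    n = len(y)
    mean = sum(y) / n
    centred = [value - mean for value in y]

    def evidence(signal_var, lengthscale, white_var):
        cov = rbf_kernel(times, signal_var, lengthscale)
        for i in range(n):
            cov[i][i] += noise_var[i] + white_var  # fixed variances on the diagonal
        return gaussian_log_likelihood(centred, cov)

    # Time-independent model: no temporal covariance, only noise terms.
    independent = max(evidence(0.0, 1.0, w) for w in WHITE_VARS)
    dependent = max(evidence(s, l, w) for s in SIGNAL_VARS
                    for l in LENGTHSCALES for w in WHITE_VARS)
    return dependent - independent


def rank_by_bayes_factor(times, series):
    """Sort {name: (values, variances)} by decreasing log Bayes factor."""
    scores = {name: log_bayes_factor(times, y, v) for name, (y, v) in series.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 4.0, 8.0]  # short, irregularly sampled series
    data = {
        "flat": ([1.0, 1.1, 0.9, 1.05, 0.95], [0.04] * 5),
        "trend": ([0.0, 1.0, 2.0, 3.0, 4.0], [0.01] * 5),
    }
    for name, score in rank_by_bayes_factor(times, data):
        print(name, round(score, 3))
```

In this sketch the fixed variances enter both models on the covariance diagonal, so fluctuations that the reported uncertainty already explains do not count as evidence for temporal change. This is one plausible reading of how variance information could help limit false positives.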
format Online
Article
Text
id pubmed-6172792
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6172792 2018-10-15 GPrank: an R package for detecting dynamic elements from genome-wide time series Topa, Hande Honkela, Antti BMC Bioinformatics Software
BioMed Central 2018-10-04 /pmc/articles/PMC6172792/ /pubmed/30286713 http://dx.doi.org/10.1186/s12859-018-2370-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Topa, Hande
Honkela, Antti
GPrank: an R package for detecting dynamic elements from genome-wide time series
title GPrank: an R package for detecting dynamic elements from genome-wide time series
title_sort gprank: an r package for detecting dynamic elements from genome-wide time series
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6172792/
https://www.ncbi.nlm.nih.gov/pubmed/30286713
http://dx.doi.org/10.1186/s12859-018-2370-4