CAGEfightR: analysis of 5′-end data using R/Bioconductor

BACKGROUND: 5′-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5′-end methods provide genome-wide maps of transcription start sites (TSSs) with base pair resolution. Because active enhancers often feature...

Bibliographic Details
Main authors: Thodberg, Malte; Thieffry, Axel; Vitting-Seerup, Kristoffer; Andersson, Robin; Sandelin, Albin
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6778389/
https://www.ncbi.nlm.nih.gov/pubmed/31585526
http://dx.doi.org/10.1186/s12859-019-3029-5
_version_ 1783456755702824960
author Thodberg, Malte
Thieffry, Axel
Vitting-Seerup, Kristoffer
Andersson, Robin
Sandelin, Albin
author_facet Thodberg, Malte
Thieffry, Axel
Vitting-Seerup, Kristoffer
Andersson, Robin
Sandelin, Albin
author_sort Thodberg, Malte
collection PubMed
description BACKGROUND: 5′-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5′-end methods provide genome-wide maps of transcription start sites (TSSs) with base pair resolution. Because active enhancers often feature bidirectional TSSs, such data can also be used to predict enhancer candidates. The current availability of mature and comprehensive computational tools for the analysis of 5′-end data is limited, preventing efficient analysis of new and existing 5′-end data. RESULTS: We present CAGEfightR, a framework for analysis of CAGE and other 5′-end data implemented as an R/Bioconductor-package. CAGEfightR can import data from BigWig files and allows for fast and memory efficient prediction and analysis of TSSs and enhancers. Downstream analyses include quantification, normalization, annotation with transcript and gene models, TSS shape statistics, linking TSSs to enhancers via co-expression, identification of enhancer clusters, and genome-browser style visualization. While built to analyze CAGE data, we demonstrate the utility of CAGEfightR in analyzing nascent RNA 5′-data (PRO-Cap). CAGEfightR is implemented using standard Bioconductor classes, making it easy to learn, use and combine with other Bioconductor packages, for example popular differential expression tools such as limma, DESeq2 and edgeR. CONCLUSIONS: CAGEfightR provides a single, scalable and easy-to-use framework for comprehensive downstream analysis of 5′-end data. CAGEfightR is designed to be interoperable with other Bioconductor packages, thereby unlocking hundreds of mature transcriptomic analysis tools for 5′-end data. CAGEfightR is freely available via Bioconductor: bioconductor.org/packages/CAGEfightR. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3029-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6778389
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6778389 2019-10-07 CAGEfightR: analysis of 5′-end data using R/Bioconductor Thodberg, Malte Thieffry, Axel Vitting-Seerup, Kristoffer Andersson, Robin Sandelin, Albin BMC Bioinformatics Software BACKGROUND: 5′-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5′-end methods provide genome-wide maps of transcription start sites (TSSs) with base pair resolution. Because active enhancers often feature bidirectional TSSs, such data can also be used to predict enhancer candidates. The current availability of mature and comprehensive computational tools for the analysis of 5′-end data is limited, preventing efficient analysis of new and existing 5′-end data. RESULTS: We present CAGEfightR, a framework for analysis of CAGE and other 5′-end data implemented as an R/Bioconductor-package. CAGEfightR can import data from BigWig files and allows for fast and memory efficient prediction and analysis of TSSs and enhancers. Downstream analyses include quantification, normalization, annotation with transcript and gene models, TSS shape statistics, linking TSSs to enhancers via co-expression, identification of enhancer clusters, and genome-browser style visualization. While built to analyze CAGE data, we demonstrate the utility of CAGEfightR in analyzing nascent RNA 5′-data (PRO-Cap). CAGEfightR is implemented using standard Bioconductor classes, making it easy to learn, use and combine with other Bioconductor packages, for example popular differential expression tools such as limma, DESeq2 and edgeR. CONCLUSIONS: CAGEfightR provides a single, scalable and easy-to-use framework for comprehensive downstream analysis of 5′-end data. CAGEfightR is designed to be interoperable with other Bioconductor packages, thereby unlocking hundreds of mature transcriptomic analysis tools for 5′-end data. CAGEfightR is freely available via Bioconductor: bioconductor.org/packages/CAGEfightR. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3029-5) contains supplementary material, which is available to authorized users. BioMed Central 2019-10-04 /pmc/articles/PMC6778389/ /pubmed/31585526 http://dx.doi.org/10.1186/s12859-019-3029-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Thodberg, Malte
Thieffry, Axel
Vitting-Seerup, Kristoffer
Andersson, Robin
Sandelin, Albin
CAGEfightR: analysis of 5′-end data using R/Bioconductor
title CAGEfightR: analysis of 5′-end data using R/Bioconductor
title_full CAGEfightR: analysis of 5′-end data using R/Bioconductor
title_fullStr CAGEfightR: analysis of 5′-end data using R/Bioconductor
title_full_unstemmed CAGEfightR: analysis of 5′-end data using R/Bioconductor
title_short CAGEfightR: analysis of 5′-end data using R/Bioconductor
title_sort cagefightr: analysis of 5′-end data using r/bioconductor
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6778389/
https://www.ncbi.nlm.nih.gov/pubmed/31585526
http://dx.doi.org/10.1186/s12859-019-3029-5
work_keys_str_mv AT thodbergmalte cagefightranalysisof5enddatausingrbioconductor
AT thieffryaxel cagefightranalysisof5enddatausingrbioconductor
AT vittingseerupkristoffer cagefightranalysisof5enddatausingrbioconductor
AT anderssonrobin cagefightranalysisof5enddatausingrbioconductor
AT sandelinalbin cagefightranalysisof5enddatausingrbioconductor