Zipper plot: visualizing transcriptional activity of genomic regions

BACKGROUND: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may re...

Bibliographic Details
Main authors: Avila Cobos, Francisco, Anckaert, Jasper, Volders, Pieter-Jan, Everaert, Celine, Rombaut, Dries, Vandesompele, Jo, De Preter, Katleen, Mestdagh, Pieter
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414305/
https://www.ncbi.nlm.nih.gov/pubmed/28464823
http://dx.doi.org/10.1186/s12859-017-1651-7
author Avila Cobos, Francisco
Anckaert, Jasper
Volders, Pieter-Jan
Everaert, Celine
Rombaut, Dries
Vandesompele, Jo
De Preter, Katleen
Mestdagh, Pieter
collection PubMed
description BACKGROUND: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. RESULTS: To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative for transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. CONCLUSION: Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5′-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1651-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5414305
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-54143052017-05-03 Zipper plot: visualizing transcriptional activity of genomic regions Avila Cobos, Francisco Anckaert, Jasper Volders, Pieter-Jan Everaert, Celine Rombaut, Dries Vandesompele, Jo De Preter, Katleen Mestdagh, Pieter BMC Bioinformatics Methodology Article BACKGROUND: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. RESULTS: To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative for transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. CONCLUSION: Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5′-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1651-7) contains supplementary material, which is available to authorized users. BioMed Central 2017-05-02 /pmc/articles/PMC5414305/ /pubmed/28464823 http://dx.doi.org/10.1186/s12859-017-1651-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Zipper plot: visualizing transcriptional activity of genomic regions
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414305/
https://www.ncbi.nlm.nih.gov/pubmed/28464823
http://dx.doi.org/10.1186/s12859-017-1651-7
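The abstract in this record describes the tool's input as three tab-separated fields: chromosome, genomic coordinate of the TSS, and strand. As a minimal sketch of how such a table could be read and validated before submission, the Python below parses that layout. The names `TssRecord` and `parse_tss_table`, the `chr1`-style chromosome labels, and the `+`/`-` strand symbols are assumptions made for illustration; the record does not specify them, and this is not the authors' R implementation.

```python
import csv
import io
from typing import List, NamedTuple


class TssRecord(NamedTuple):
    """One putative transcription start site (hypothetical container)."""
    chromosome: str
    position: int
    strand: str


def parse_tss_table(text: str) -> List[TssRecord]:
    """Parse three tab-separated fields per line: chromosome, TSS coordinate, strand."""
    records = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields, got {len(row)}")
        chromosome, position, strand = (field.strip() for field in row)
        # Assumption: strand is encoded as "+" or "-".
        if strand not in {"+", "-"}:
            raise ValueError(f"line {line_number}: unexpected strand {strand!r}")
        # int() raises ValueError if the coordinate is not a whole number.
        records.append(TssRecord(chromosome, int(position), strand))
    return records


example = "chr1\t1000\t+\nchr2\t2500\t-\n"
for record in parse_tss_table(example):
    print(record.chromosome, record.position, record.strand)
```

Validating the field count and strand up front is a design choice for this sketch: it surfaces malformed lines with their line number instead of letting them pass silently into a downstream analysis.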