Zipper plot: visualizing transcriptional activity of genomic regions
BACKGROUND: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may re...
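Main Authors: | Avila Cobos, Francisco; Anckaert, Jasper; Volders, Pieter-Jan; Everaert, Celine; Rombaut, Dries; Vandesompele, Jo; De Preter, Katleen; Mestdagh, Pieter |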
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414305/ https://www.ncbi.nlm.nih.gov/pubmed/28464823 http://dx.doi.org/10.1186/s12859-017-1651-7 |
_version_ | 1783233347381624832 |
---|---|
author | Avila Cobos, Francisco; Anckaert, Jasper; Volders, Pieter-Jan; Everaert, Celine; Rombaut, Dries; Vandesompele, Jo; De Preter, Katleen; Mestdagh, Pieter |
author_facet | Avila Cobos, Francisco; Anckaert, Jasper; Volders, Pieter-Jan; Everaert, Celine; Rombaut, Dries; Vandesompele, Jo; De Preter, Katleen; Mestdagh, Pieter |
author_sort | Avila Cobos, Francisco |
collection | PubMed |
description | BACKGROUND: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. RESULTS: To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative for transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. CONCLUSION: Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5′-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1651-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5414305 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5414305 2017-05-03 Zipper plot: visualizing transcriptional activity of genomic regions Avila Cobos, Francisco Anckaert, Jasper Volders, Pieter-Jan Everaert, Celine Rombaut, Dries Vandesompele, Jo De Preter, Katleen Mestdagh, Pieter BMC Bioinformatics Methodology Article BACKGROUND: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. RESULTS: To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative for transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. CONCLUSION: Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5′-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1651-7) contains supplementary material, which is available to authorized users. BioMed Central 2017-05-02 /pmc/articles/PMC5414305/ /pubmed/28464823 http://dx.doi.org/10.1186/s12859-017-1651-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Avila Cobos, Francisco Anckaert, Jasper Volders, Pieter-Jan Everaert, Celine Rombaut, Dries Vandesompele, Jo De Preter, Katleen Mestdagh, Pieter Zipper plot: visualizing transcriptional activity of genomic regions |
title | Zipper plot: visualizing transcriptional activity of genomic regions |
title_full | Zipper plot: visualizing transcriptional activity of genomic regions |
title_fullStr | Zipper plot: visualizing transcriptional activity of genomic regions |
title_full_unstemmed | Zipper plot: visualizing transcriptional activity of genomic regions |
title_short | Zipper plot: visualizing transcriptional activity of genomic regions |
title_sort | zipper plot: visualizing transcriptional activity of genomic regions |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5414305/ https://www.ncbi.nlm.nih.gov/pubmed/28464823 http://dx.doi.org/10.1186/s12859-017-1651-7 |
work_keys_str_mv | AT avilacobosfrancisco zipperplotvisualizingtranscriptionalactivityofgenomicregions AT anckaertjasper zipperplotvisualizingtranscriptionalactivityofgenomicregions AT volderspieterjan zipperplotvisualizingtranscriptionalactivityofgenomicregions AT everaertceline zipperplotvisualizingtranscriptionalactivityofgenomicregions AT rombautdries zipperplotvisualizingtranscriptionalactivityofgenomicregions AT vandesompelejo zipperplotvisualizingtranscriptionalactivityofgenomicregions AT depreterkatleen zipperplotvisualizingtranscriptionalactivityofgenomicregions AT mestdaghpieter zipperplotvisualizingtranscriptionalactivityofgenomicregions |
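
The abstract in this record states that the method takes three tab-separated fields as input: chromosome, genomic coordinate of the TSS, and strand. The sketch below illustrates how such input might be read and validated in Python. It is not the authors' code (the record says the tool itself is implemented in R), and several details are assumptions rather than facts from the record: the names `TssRecord` and `parse_tss_lines` are hypothetical, the strand is assumed to be encoded as `+` or `-`, and the chromosome names and coordinates in the usage note are made up.

```python
from typing import Iterable, List, NamedTuple


class TssRecord(NamedTuple):
    """One putative transcription start site (hypothetical container)."""
    chromosome: str
    position: int
    strand: str


def parse_tss_lines(lines: Iterable[str]) -> List[TssRecord]:
    """Parse lines with three tab-separated fields: chromosome, TSS coordinate, strand.

    Assumption: strand is written as '+' or '-'; the record does not specify
    the encoding, so adjust this check to the tool's actual requirements.
    """
    records: List[TssRecord] = []
    for line_number, raw_line in enumerate(lines, start=1):
        line = raw_line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 tab-separated fields, got {len(fields)}"
            )
        chromosome, position, strand = (field.strip() for field in fields)
        if strand not in ("+", "-"):
            raise ValueError(f"line {line_number}: unexpected strand {strand!r}")
        # int() raises ValueError if the coordinate is not a whole number
        records.append(TssRecord(chromosome, int(position), strand))
    return records
```

For example, `parse_tss_lines(["chr1\t1000\t+", "chrX\t2500\t-"])` returns two `TssRecord` entries, while a line with a missing field or an unrecognized strand symbol raises `ValueError` with the offending line number.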