Vespucci: a system for building annotated databases of nascent transcripts

Global run-on sequencing (GRO-seq) is a recent addition to the series of high-throughput sequencing methods that enables new insights into transcriptional dynamics within a cell. However, GRO-sequencing presents new algorithmic challenges, as existing analysis platforms for ChIP-seq and RNA-seq do n...

Bibliographic Details
Main Authors:	Allison, Karmel A.; Kaikkonen, Minna U.; Gaasterland, Terry; Glass, Christopher K.
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2014
Subjects:	Genomics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3936758/ https://www.ncbi.nlm.nih.gov/pubmed/24304890 http://dx.doi.org/10.1093/nar/gkt1237

author	Allison, Karmel A. Kaikkonen, Minna U. Gaasterland, Terry Glass, Christopher K.
author_sort	Allison, Karmel A.
collection	PubMed
description	Global run-on sequencing (GRO-seq) is a recent addition to the series of high-throughput sequencing methods that enables new insights into transcriptional dynamics within a cell. However, GRO-sequencing presents new algorithmic challenges, as existing analysis platforms for ChIP-seq and RNA-seq do not address the unique problem of identifying transcriptional units de novo from short reads located all across the genome. Here, we present a novel algorithm for de novo transcript identification from GRO-sequencing data, along with a system that determines transcript regions, stores them in a relational database and associates them with known reference annotations. We use this method to analyze GRO-sequencing data from primary mouse macrophages and derive novel quantitative insights into the extent and characteristics of non-coding transcription in mammalian cells. In doing so, we demonstrate that Vespucci expands existing annotations for mRNAs and lincRNAs by defining the primary transcript beyond the polyadenylation site. In addition, Vespucci generates assemblies for un-annotated non-coding RNAs such as those transcribed from enhancer-like elements. Vespucci thereby provides a robust system for defining, storing and analyzing diverse classes of primary RNA transcripts that are of increasing biological interest.
format	Online Article Text
id	pubmed-3936758
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-39367582014-03-04 Vespucci: a system for building annotated databases of nascent transcripts Allison, Karmel A. Kaikkonen, Minna U. Gaasterland, Terry Glass, Christopher K. Nucleic Acids Res Genomics Global run-on sequencing (GRO-seq) is a recent addition to the series of high-throughput sequencing methods that enables new insights into transcriptional dynamics within a cell. However, GRO-sequencing presents new algorithmic challenges, as existing analysis platforms for ChIP-seq and RNA-seq do not address the unique problem of identifying transcriptional units de novo from short reads located all across the genome. Here, we present a novel algorithm for de novo transcript identification from GRO-sequencing data, along with a system that determines transcript regions, stores them in a relational database and associates them with known reference annotations. We use this method to analyze GRO-sequencing data from primary mouse macrophages and derive novel quantitative insights into the extent and characteristics of non-coding transcription in mammalian cells. In doing so, we demonstrate that Vespucci expands existing annotations for mRNAs and lincRNAs by defining the primary transcript beyond the polyadenylation site. In addition, Vespucci generates assemblies for un-annotated non-coding RNAs such as those transcribed from enhancer-like elements. Vespucci thereby provides a robust system for defining, storing and analyzing diverse classes of primary RNA transcripts that are of increasing biological interest. Oxford University Press 2014-02 2013-12-04 /pmc/articles/PMC3936758/ /pubmed/24304890 http://dx.doi.org/10.1093/nar/gkt1237 Text en © The Author(s) 2013. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Vespucci: a system for building annotated databases of nascent transcripts
topic	Genomics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3936758/ https://www.ncbi.nlm.nih.gov/pubmed/24304890 http://dx.doi.org/10.1093/nar/gkt1237
