The full-length transcriptome of C. elegans using direct RNA sequencing

Current transcriptome annotations have largely relied on short read lengths intrinsic to the most widely used high-throughput cDNA sequencing technologies. For example, in the annotation of the Caenorhabditis elegans transcriptome, more than half of the transcript isoforms lack full-length support a...


Bibliographic Details
Main Authors: Roach, Nathan P., Sadowski, Norah, Alessi, Amelia F., Timp, Winston, Taylor, James, Kim, John K.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050520/
https://www.ncbi.nlm.nih.gov/pubmed/32024661
http://dx.doi.org/10.1101/gr.251314.119
author Roach, Nathan P.
Sadowski, Norah
Alessi, Amelia F.
Timp, Winston
Taylor, James
Kim, John K.
author_sort Roach, Nathan P.
collection PubMed
description Current transcriptome annotations have largely relied on short read lengths intrinsic to the most widely used high-throughput cDNA sequencing technologies. For example, in the annotation of the Caenorhabditis elegans transcriptome, more than half of the transcript isoforms lack full-length support and instead rely on inference from short reads that do not span the full length of the isoform. We applied nanopore-based direct RNA sequencing to characterize the developmental polyadenylated transcriptome of C. elegans. Taking advantage of long reads spanning the full length of mRNA transcripts, we provide support for 23,865 splice isoforms across 14,611 genes, without the need for computational reconstruction of gene models. Of the isoforms identified, 3452 are novel splice isoforms not present in the WormBase WS265 annotation. Furthermore, we identified 16,342 isoforms in the 3′ untranslated region (3′ UTR), 2640 of which are novel and do not fall within 10 bp of existing 3′-UTR data sets and annotations. Combining 3′ UTRs and splice isoforms, we identified 28,858 full-length transcript isoforms. We also determined that poly(A) tail lengths of transcripts vary across development, as do the strengths of previously reported correlations between poly(A) tail length and expression level, and poly(A) tail length and 3′-UTR length. Finally, we have formatted this data as a publicly accessible track hub, enabling researchers to explore this data set easily in a genome browser.
format Online
Article
Text
id pubmed-7050520
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-70505202020-03-16 The full-length transcriptome of C. elegans using direct RNA sequencing Roach, Nathan P. Sadowski, Norah Alessi, Amelia F. Timp, Winston Taylor, James Kim, John K. Genome Res Resource Current transcriptome annotations have largely relied on short read lengths intrinsic to the most widely used high-throughput cDNA sequencing technologies. For example, in the annotation of the Caenorhabditis elegans transcriptome, more than half of the transcript isoforms lack full-length support and instead rely on inference from short reads that do not span the full length of the isoform. We applied nanopore-based direct RNA sequencing to characterize the developmental polyadenylated transcriptome of C. elegans. Taking advantage of long reads spanning the full length of mRNA transcripts, we provide support for 23,865 splice isoforms across 14,611 genes, without the need for computational reconstruction of gene models. Of the isoforms identified, 3452 are novel splice isoforms not present in the WormBase WS265 annotation. Furthermore, we identified 16,342 isoforms in the 3′ untranslated region (3′ UTR), 2640 of which are novel and do not fall within 10 bp of existing 3′-UTR data sets and annotations. Combining 3′ UTRs and splice isoforms, we identified 28,858 full-length transcript isoforms. We also determined that poly(A) tail lengths of transcripts vary across development, as do the strengths of previously reported correlations between poly(A) tail length and expression level, and poly(A) tail length and 3′-UTR length. Finally, we have formatted this data as a publicly accessible track hub, enabling researchers to explore this data set easily in a genome browser. 
Cold Spring Harbor Laboratory Press 2020-02 /pmc/articles/PMC7050520/ /pubmed/32024661 http://dx.doi.org/10.1101/gr.251314.119 Text en © 2020 Roach et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title The full-length transcriptome of C. elegans using direct RNA sequencing
title_sort full-length transcriptome of c. elegans using direct rna sequencing
topic Resource
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050520/
https://www.ncbi.nlm.nih.gov/pubmed/32024661
http://dx.doi.org/10.1101/gr.251314.119