A long-read RNA-seq approach to identify novel transcripts of very large genes

Bibliographic Details
Main authors: Uapinyoying, Prech; Goecks, Jeremy; Knoblach, Susan M.; Panchapakesan, Karuna; Bonnemann, Carsten G.; Partridge, Terence A.; Jaiswal, Jyoti K.; Hoffman, Eric P.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7370890/
https://www.ncbi.nlm.nih.gov/pubmed/32660935
http://dx.doi.org/10.1101/gr.259903.119
author Uapinyoying, Prech
Goecks, Jeremy
Knoblach, Susan M.
Panchapakesan, Karuna
Bonnemann, Carsten G.
Partridge, Terence A.
Jaiswal, Jyoti K.
Hoffman, Eric P.
collection PubMed
description RNA-seq is widely used for studying gene expression, but commonly used sequencing platforms produce short reads that only span up to two exon junctions per read. This makes it difficult to accurately determine the composition and phasing of exons within transcripts. Although long-read sequencing improves this issue, it is not amenable to precise quantitation, which limits its utility for differential expression studies. We used long-read isoform sequencing combined with a novel analysis approach to compare alternative splicing of large, repetitive structural genes in muscles. Analysis of muscle structural genes that produce medium (Nrap: 5 kb), large (Neb: 22 kb), and very large (Ttn: 106 kb) transcripts in cardiac muscle, and fast and slow skeletal muscles identified unannotated exons for each of these ubiquitous muscle genes. This also identified differential exon usage and phasing for these genes between the different muscle types. By mapping the in-phase transcript structures to known annotations, we also identified and quantified previously unannotated transcripts. Results were confirmed by endpoint PCR and Sanger sequencing, which revealed muscle-type-specific differential expression of these novel transcripts. The improved transcript identification and quantification shown by our approach removes previous impediments to studies aimed at quantitative differential expression of ultralong transcripts.
format Online
Article
Text
id pubmed-7370890
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-7370890 2020-07-24 Genome Res Method
Cold Spring Harbor Laboratory Press 2020-06 /pmc/articles/PMC7370890/ /pubmed/32660935 http://dx.doi.org/10.1101/gr.259903.119 Text en © 2020 Uapinyoying et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title A long-read RNA-seq approach to identify novel transcripts of very large genes
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7370890/
https://www.ncbi.nlm.nih.gov/pubmed/32660935
http://dx.doi.org/10.1101/gr.259903.119