
The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome

Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes c...


Bibliographic Details
Main Authors:	Hoang, Nam V.; Furtado, Agnelo; Perlo, Virginie; Botha, Frederik C.; Henry, Robert J.
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2019
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6664245/ https://www.ncbi.nlm.nih.gov/pubmed/31396260 http://dx.doi.org/10.3389/fgene.2019.00654

author	Hoang, Nam V.; Furtado, Agnelo; Perlo, Virginie; Botha, Frederik C.; Henry, Robert J.
author_sort	Hoang, Nam V.
collection	PubMed
description	Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a highly polyploid plant, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, normalization removed many longer transcripts and detected many new, generally shorter transcripts. For the same input cDNA and data yield, the normalized library recovered more total transcript isoforms, predicted gene families, and orthologous groups than the non-normalized library, resulting in a higher representation of the sugarcane transcriptome. The non-normalized library, on the other hand, included a wider transcript length range, with more transcripts longer than ∼1.25 kb, more transcript isoforms per gene family, and more gene ontology terms per transcript. A large proportion of the unique transcripts, which comprised ∼52% of the normalized library, were expressed at a lower level than the unique transcripts from the non-normalized library across the three tissue types tested (leaf, stalk, and root). About 83% of the total 5,348 predicted long noncoding transcripts were derived from the normalized library, of which ∼80% were derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated that the two approaches are complementary in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.
format	Online Article Text
id	pubmed-6664245
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-6664245 2019-08-08 The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome. Hoang, Nam V.; Furtado, Agnelo; Perlo, Virginie; Botha, Frederik C.; Henry, Robert J. Front Genet. Genetics. Frontiers Media S.A. 2019-07-23 /pmc/articles/PMC6664245/ /pubmed/31396260 http://dx.doi.org/10.3389/fgene.2019.00654 Text en Copyright © 2019 Hoang, Furtado, Perlo, Botha and Henry http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome
topic	Genetics
