
De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts

Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an...


Bibliographic Details
Main authors: Hoang, Nam V., Furtado, Agnelo, Thirugnanasambandam, Prathima P., Botha, Frederik C., Henry, Robert J.
Format: Online Article Text
Language: English
Published: Elsevier 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5968133/
https://www.ncbi.nlm.nih.gov/pubmed/29862346
http://dx.doi.org/10.1016/j.heliyon.2018.e00583
_version_ 1783325711225847808
author Hoang, Nam V.
Furtado, Agnelo
Thirugnanasambandam, Prathima P.
Botha, Frederik C.
Henry, Robert J.
author_sort Hoang, Nam V.
collection PubMed
description Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering to reduce transcript redundancy while maintaining gene content. Here, we present a comprehensive analysis of the effect of different assembly settings and clustering methods on de novo assembly, annotation and transcript profiling focusing especially on the coding transcripts from the highly polyploid sugarcane genome. The new coding sequence-based transcript clustering resulted in a better representation of transcripts compared to the earlier approach, having 121,987 contigs, which included 78,052 main and 43,935 alternative transcripts. About 73%, 67%, 61% and 10% of the transcriptome was annotated against the NCBI NR protein database, GO terms, orthologous groups and KEGG orthologies, respectively. Using this set for a differential gene expression analysis between the young and mature sugarcane culm tissues, a total of 822 transcripts were found to be differentially expressed, including key transcripts involved in sugar/fiber accumulation in sugarcane. In the context of the lack of a whole genome sequence for sugarcane, the availability of a well annotated culm-derived meta-transcriptome through deep sequencing provides useful information on coding genes specific to the sugarcane culm and will certainly contribute to understanding the process of carbon partitioning, and biomass accumulation in the sugarcane culm.
format Online
Article
Text
id pubmed-5968133
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-5968133 2018-06-01 Heliyon Article Elsevier 2018-03-22 /pmc/articles/PMC5968133/ /pubmed/29862346 http://dx.doi.org/10.1016/j.heliyon.2018.e00583 Text en © 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts
topic Article