
Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms

BACKGROUND: Miscanthus sinensis Andersson is a perennial grass that exhibits remarkable lignocellulose characteristics suitable for sustainable bioenergy production. However, knowledge of the genetic resources of this species is relatively limited, which considerably hampers further work on its biol...


Bibliographic Details
Main authors: Wang, Yongli, Li, Xia, Wang, Congsheng, Gao, Lu, Wu, Yanfang, Ni, Xingnan, Sun, Jianzhong, Jiang, Jianxiong
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459517/
https://www.ncbi.nlm.nih.gov/pubmed/34551715
http://dx.doi.org/10.1186/s12864-021-07971-x
_version_ 1784571539653394432
author Wang, Yongli
Li, Xia
Wang, Congsheng
Gao, Lu
Wu, Yanfang
Ni, Xingnan
Sun, Jianzhong
Jiang, Jianxiong
author_facet Wang, Yongli
Li, Xia
Wang, Congsheng
Gao, Lu
Wu, Yanfang
Ni, Xingnan
Sun, Jianzhong
Jiang, Jianxiong
author_sort Wang, Yongli
collection PubMed
description BACKGROUND: Miscanthus sinensis Andersson is a perennial grass that exhibits remarkable lignocellulose characteristics suitable for sustainable bioenergy production. However, knowledge of the genetic resources of this species is relatively limited, which considerably hampers further work on its biology and genetic improvement. RESULTS: In this study, by analyzing the transcriptome of mixed samples of leaves and stems using PacBio Iso-Seq sequencing technology combined with Illumina HiSeq, we report the first full-length transcriptome dataset of M. sinensis, with a total of 58.21 Gb of clean data. An average of 15.75 Gb of clean reads per sample was obtained from the PacBio Iso-Seq system, more than double the data size (6.68 Gb) obtained from the Illumina HiSeq platform. The integrated analyses of PacBio- and Illumina-based transcriptomic data uncovered 408,801 non-redundant transcripts with an average length of 1,685 bp. Of those, 189,406 transcripts were identified by both methods, 169,149 transcripts with an average length of 619 bp were uniquely identified by Illumina HiSeq, and 51,246 transcripts with an average length of 2,535 bp were uniquely identified by PacBio Iso-Seq. Approximately 96% of the final combined transcripts were mapped back to the Miscanthus genome, reflecting the high quality and coverage of our sequencing results. When our data were compared with the genomes of four species of Andropogoneae, M. sinensis showed the closest relationship with sugarcane, with mapping ratios of up to 93%, followed by sorghum, with mapping ratios of up to 80%, indicating a high conservation of orthologs in these three genomes. Furthermore, 306,228 transcripts were successfully annotated against public databases, including cell wall-related genes and transcription factor families, thus providing many new insights into gene functions. The PacBio Iso-Seq data also helped identify 3,898 alternative splicing (AS) events and 2,963 annotated AS isoforms within 10 functional categories. CONCLUSIONS: Taken together, the present study provides a rich data set of full-length transcripts that greatly enriches our understanding of M. sinensis transcriptomic resources, thus facilitating further genetic improvement and molecular studies of the Miscanthus species. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07971-x.
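The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is only a sketch for readers of this record, not part of the authors' analysis: all variable and function names are illustrative, and the numbers are copied from the abstract as given. Note that the three transcript categories as printed sum to 409,801, which is 1,000 more than the stated total of 408,801; the abstract does not explain the difference, so the sketch simply reports it.

```python
# Figures copied verbatim from the abstract; names are hypothetical labels
# chosen for this sketch, not identifiers used by the authors.
REPORTED_TOTAL = 408_801   # non-redundant transcripts (as reported)
SHARED = 189_406           # identified by both platforms
ILLUMINA_ONLY = 169_149    # uniquely identified by Illumina HiSeq
PACBIO_ONLY = 51_246       # uniquely identified by PacBio Iso-Seq
ANNOTATED = 306_228        # transcripts annotated against public databases
PACBIO_GB = 15.75          # average clean data per sample, PacBio Iso-Seq
ILLUMINA_GB = 6.68         # data size quoted for Illumina HiSeq


def summarize():
    """Return simple consistency figures derived from the abstract."""
    category_sum = SHARED + ILLUMINA_ONLY + PACBIO_ONLY
    return {
        "category_sum": category_sum,
        "difference_from_reported": category_sum - REPORTED_TOTAL,
        "annotated_fraction": ANNOTATED / REPORTED_TOTAL,
        "pacbio_to_illumina_ratio": PACBIO_GB / ILLUMINA_GB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under these inputs the annotated share works out to roughly 75% of the reported total, and the PacBio figure is about 2.4 times the Illumina figure, which is consistent with the abstract's statement that the PacBio data were more than double.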
format Online
Article
Text
id pubmed-8459517
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8459517 2021-09-23 Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms Wang, Yongli Li, Xia Wang, Congsheng Gao, Lu Wu, Yanfang Ni, Xingnan Sun, Jianzhong Jiang, Jianxiong BMC Genomics Research Article BACKGROUND: Miscanthus sinensis Andersson is a perennial grass that exhibits remarkable lignocellulose characteristics suitable for sustainable bioenergy production. However, knowledge of the genetic resources of this species is relatively limited, which considerably hampers further work on its biology and genetic improvement. RESULTS: In this study, through analyzing the transcriptome of mixed samples of leaves and stems using the latest PacBio Iso-Seq sequencing technology combined with Illumina HiSeq, we report the first full-length transcriptome dataset of M. sinensis with a total of 58.21 Gb clean data. An average of 15.75 Gb clean reads of each sample were obtained from the PacBio Iso-Seq system, which doubled the data size (6.68 Gb) obtained from the Illumina HiSeq platform. The integrated analyses of PacBio- and Illumina-based transcriptomic data uncovered 408,801 non-redundant transcripts with an average length of 1,685 bp. Of those, 189,406 transcripts were commonly identified by both methods, 169,149 transcripts with an average length of 619 bp were uniquely identified by Illumina HiSeq, and 51,246 transcripts with an average length of 2,535 bp were uniquely identified by PacBio Iso-Seq. Approximately 96 % of the final combined transcripts were mapped back to the Miscanthus genome, reflecting the high quality and coverage of our sequencing results. When comparing our data with genomes of four species of Andropogoneae, M. sinensis showed the closest relationship with sugarcane with up to 93 % mapping ratios, followed by sorghum with up to 80 % mapping ratios, indicating a high conservation of orthologs in these three genomes. Furthermore, 306,228 transcripts were successfully annotated against public databases including cell wall related genes and transcript factor families, thus providing many new insights into gene functions. The PacBio Iso-Seq data also helped identify 3,898 alternative splicing events and 2,963 annotated AS isoforms within 10 function categories. CONCLUSIONS: Taken together, the present study provides a rich data set of full-length transcripts that greatly enriches our understanding of M. sinensis transcriptomic resources, thus facilitating further genetic improvement and molecular studies of the Miscanthus species. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07971-x. BioMed Central 2021-09-22 /pmc/articles/PMC8459517/ /pubmed/34551715 http://dx.doi.org/10.1186/s12864-021-07971-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Wang, Yongli
Li, Xia
Wang, Congsheng
Gao, Lu
Wu, Yanfang
Ni, Xingnan
Sun, Jianzhong
Jiang, Jianxiong
Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms
title Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms
title_full Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms
title_fullStr Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms
title_full_unstemmed Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms
title_short Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms
title_sort unveiling the transcriptomic complexity of miscanthus sinensis using a combination of pacbio long read- and illumina short read sequencing platforms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459517/
https://www.ncbi.nlm.nih.gov/pubmed/34551715
http://dx.doi.org/10.1186/s12864-021-07971-x