Long read sequencing to reveal the full complexity of a plant transcriptome by targeting both standard and long workflows

Bibliographic Details
Main Authors:	Al-Dossary, Othman; Furtado, Agnelo; KharabianMasouleh, Ardashir; Alsubaie, Bader; Al-Mssallem, Ibrahim; Henry, Robert J.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2023
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10589961/ https://www.ncbi.nlm.nih.gov/pubmed/37865785 http://dx.doi.org/10.1186/s13007-023-01091-1

author	Al-Dossary, Othman; Furtado, Agnelo; KharabianMasouleh, Ardashir; Alsubaie, Bader; Al-Mssallem, Ibrahim; Henry, Robert J.
collection	PubMed
description	BACKGROUND: Long read sequencing allows the analysis of full-length transcripts in plants without the challenges of reliable transcriptome assembly. Long read sequencing of transcripts from plant genomes has often utilized sized transcript libraries. However, the value of including libraries of differing sizes has not been established. METHODS: A comprehensive transcriptome of the leaves of Jojoba (Simmondsia chinensis) was generated from two different PacBio library preparations: standard workflow (SW) and long workflow (LW). RESULTS: The importance of using both transcript groups in the analysis was demonstrated by the high proportion of unique sequences (74.6%) that were not shared between the groups. A total of 37.8% longer transcripts were only detected in the long dataset. The completeness of the combined transcriptome was indicated by the presence of 98.7% of genes predicted in the jojoba male reference genome. The high coverage of the transcriptome was further confirmed by BUSCO analysis showing the presence of 96.9% of the genes from the core viridiplantae_odb10 lineage. The high-quality isoforms post Cd-Hit merged dataset of the two workflows had a total of 167,866 isoforms. Most of the transcript isoforms were protein-coding sequences (71.7%) containing open reading frames (ORFs) ≥ 100 amino acids (aa). Alternative splicing and intron retention were the basis of most transcript diversity when analysed at the whole genome level and by specific analysis of the apetala2 gene families. CONCLUSION: This suggests the need to specifically target the capture of longer transcripts to provide more comprehensive genome coverage in plant transcriptome analysis and reveal the high level of alternative splicing. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13007-023-01091-1.
format	Online Article Text
id	pubmed-10589961
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-10589961 2023-10-22 Plant Methods Methodology BioMed Central 2023-10-21 /pmc/articles/PMC10589961/ /pubmed/37865785 http://dx.doi.org/10.1186/s13007-023-01091-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Long read sequencing to reveal the full complexity of a plant transcriptome by targeting both standard and long workflows
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10589961/ https://www.ncbi.nlm.nih.gov/pubmed/37865785 http://dx.doi.org/10.1186/s13007-023-01091-1
