Long read sequencing to reveal the full complexity of a plant transcriptome by targeting both standard and long workflows
BACKGROUND: Long read sequencing allows the analysis of full-length transcripts in plants without the challenges of reliable transcriptome assembly. Long read sequencing of transcripts from plant genomes has often utilized sized transcript libraries. However, the value of including libraries of diff...
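Main authors: | Al-Dossary, Othman; Furtado, Agnelo; KharabianMasouleh, Ardashir; Alsubaie, Bader; Al-Mssallem, Ibrahim; Henry, Robert J. |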
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10589961/ https://www.ncbi.nlm.nih.gov/pubmed/37865785 http://dx.doi.org/10.1186/s13007-023-01091-1 |
_version_ | 1785123896325832704 |
---|---|
author | Al-Dossary, Othman Furtado, Agnelo KharabianMasouleh, Ardashir Alsubaie, Bader Al-Mssallem, Ibrahim Henry, Robert J. |
author_sort | Al-Dossary, Othman |
collection | PubMed |
description | BACKGROUND: Long read sequencing allows the analysis of full-length transcripts in plants without the challenges of reliable transcriptome assembly. Long read sequencing of transcripts from plant genomes has often utilized sized transcript libraries. However, the value of including libraries of differing sizes has not been established. METHODS: A comprehensive transcriptome of the leaves of Jojoba (Simmondsia chinensis) was generated from two different PacBio library preparations: standard workflow (SW) and long workflow (LW). RESULTS: The importance of using both transcript groups in the analysis was demonstrated by the high proportion of unique sequences (74.6%) that were not shared between the groups. A total of 37.8% longer transcripts were only detected in the long dataset. The completeness of the combined transcriptome was indicated by the presence of 98.7% of genes predicted in the jojoba male reference genome. The high coverage of the transcriptome was further confirmed by BUSCO analysis showing the presence of 96.9% of the genes from the core viridiplantae_odb10 lineage. The high-quality isoforms post Cd-Hit merged dataset of the two workflows had a total of 167,866 isoforms. Most of the transcript isoforms were protein-coding sequences (71.7%) containing open reading frames (ORFs) ≥ 100 amino acids (aa). Alternative splicing and intron retention were the basis of most transcript diversity when analysed at the whole genome level and by specific analysis of the apetala2 gene families. CONCLUSION: This suggests the need to specifically target the capture of longer transcripts to provide more comprehensive genome coverage in plant transcriptome analysis and reveal the high level of alternative splicing. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13007-023-01091-1. |
format | Online Article Text |
id | pubmed-10589961 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10589961 2023-10-22 Long read sequencing to reveal the full complexity of a plant transcriptome by targeting both standard and long workflows Al-Dossary, Othman Furtado, Agnelo KharabianMasouleh, Ardashir Alsubaie, Bader Al-Mssallem, Ibrahim Henry, Robert J. Plant Methods Methodology BACKGROUND: Long read sequencing allows the analysis of full-length transcripts in plants without the challenges of reliable transcriptome assembly. Long read sequencing of transcripts from plant genomes has often utilized sized transcript libraries. However, the value of including libraries of differing sizes has not been established. METHODS: A comprehensive transcriptome of the leaves of Jojoba (Simmondsia chinensis) was generated from two different PacBio library preparations: standard workflow (SW) and long workflow (LW). RESULTS: The importance of using both transcript groups in the analysis was demonstrated by the high proportion of unique sequences (74.6%) that were not shared between the groups. A total of 37.8% longer transcripts were only detected in the long dataset. The completeness of the combined transcriptome was indicated by the presence of 98.7% of genes predicted in the jojoba male reference genome. The high coverage of the transcriptome was further confirmed by BUSCO analysis showing the presence of 96.9% of the genes from the core viridiplantae_odb10 lineage. The high-quality isoforms post Cd-Hit merged dataset of the two workflows had a total of 167,866 isoforms. Most of the transcript isoforms were protein-coding sequences (71.7%) containing open reading frames (ORFs) ≥ 100 amino acids (aa). Alternative splicing and intron retention were the basis of most transcript diversity when analysed at the whole genome level and by specific analysis of the apetala2 gene families. CONCLUSION: This suggests the need to specifically target the capture of longer transcripts to provide more comprehensive genome coverage in plant transcriptome analysis and reveal the high level of alternative splicing. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13007-023-01091-1. BioMed Central 2023-10-21 /pmc/articles/PMC10589961/ /pubmed/37865785 http://dx.doi.org/10.1186/s13007-023-01091-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Long read sequencing to reveal the full complexity of a plant transcriptome by targeting both standard and long workflows |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10589961/ https://www.ncbi.nlm.nih.gov/pubmed/37865785 http://dx.doi.org/10.1186/s13007-023-01091-1 |