Full-length SMRT transcriptome sequencing and microsatellite characterization in Paulownia catalpifolia
Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. However, few transcriptomic and genetic studies have been conducted in P. catalpifolia. In this study, single-molecule real-time sequencing technology was applied to obtain the full-len...
Main authors: | Feng, Yanzhi; Zhao, Yang; Zhang, Jiajia; Wang, Baoping; Yang, Chaowei; Zhou, Haijiang; Qiao, Jie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062547/ https://www.ncbi.nlm.nih.gov/pubmed/33888729 http://dx.doi.org/10.1038/s41598-021-87538-8 |
_version_ | 1783681789435314176 |
---|---|
author | Feng, Yanzhi Zhao, Yang Zhang, Jiajia Wang, Baoping Yang, Chaowei Zhou, Haijiang Qiao, Jie |
author_sort | Feng, Yanzhi |
collection | PubMed |
description | Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. However, few transcriptomic and genetic studies have been conducted in P. catalpifolia. In this study, single-molecule real-time (SMRT) sequencing technology was applied to obtain the full-length transcriptome of P. catalpifolia leaves treated with varying degrees of drought stress. The sequencing data were then used to search for microsatellites, or simple sequence repeats (SSRs). A total of 28.83 Gb of data were generated; after removing redundant reads, 25,969 high-quality (HQ) transcripts with an average length of 1624 bp were obtained, of which 25,602 (98.59%) were annotated using public databases. Among the HQ transcripts, 16,722 intact coding sequences, 149 long non-coding RNAs and 179 alternative splicing events were predicted. A total of 7367 SSR loci were distributed across 6293 HQ transcripts, comprising 763 complex SSRs and 6604 complete SSRs. The SSR appearance frequency was 28.37%, and the average distribution distance was 5.59 kb. Among the 6604 complete SSR loci, 1–3 nucleotide repeats were dominant, accounting for 97.85% of the total SSR loci, with mono-, di- and tri-nucleotide repeats making up 44.68%, 33.86% and 19.31%, respectively. We detected 112 repeat motifs, of which A/T (42.64%), AG/CT (12.22%), GA/TC (9.63%), GAA/TTC (1.57%) and CCA/TGG (1.54%) were the most common among the mono-, di- and tri-nucleotide repeats. The SSR repeat sequences ranged from 10 to 88 bp in length, and 4997 (75.67%) were ≤ 20 bp. This study provides a novel full-length transcriptome reference for P. catalpifolia and will facilitate the identification of germplasm resources and the breeding of new drought-resistant P. catalpifolia varieties. |
format | Online Article Text |
id | pubmed-8062547 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8062547 2021-04-23 Sci Rep Article Nature Publishing Group UK 2021-04-22 /pmc/articles/PMC8062547/ /pubmed/33888729 http://dx.doi.org/10.1038/s41598-021-87538-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Full-length SMRT transcriptome sequencing and microsatellite characterization in Paulownia catalpifolia |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062547/ https://www.ncbi.nlm.nih.gov/pubmed/33888729 http://dx.doi.org/10.1038/s41598-021-87538-8 |
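
The abstract describes searching transcripts for SSRs and reporting an appearance frequency and an average distribution distance. The record does not name the tool or the thresholds used, so the following is only a minimal sketch under stated assumptions: the minimum repeat counts in `MIN_REPEATS` are illustrative values of the kind commonly used in SSR mining, the function names are hypothetical, and complex (compound) SSRs are not handled. Frequency is taken as loci per transcript, which agrees with the reported 7367 / 25,969 ≈ 28.37%; the mean distance is assumed to be total transcript length divided by the number of loci.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only;
# the record does not state the thresholds actually used).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # e.g. skip "AA" matched as a dinucleotide
                hits.append((m.start(), motif, len(m.group(0)) // k))
    hits.sort()
    return hits


def ssr_summary(transcripts):
    """Summarise SSR loci over a dict mapping transcript name to sequence."""
    loci = [(name, *hit) for name, seq in transcripts.items()
            for hit in find_ssrs(seq)]
    total_bases = sum(len(seq) for seq in transcripts.values())
    n_loci = len(loci)
    return {
        "loci": n_loci,
        "transcripts_with_ssr": len({locus[0] for locus in loci}),
        # Assumed definition: loci per transcript, as a percentage.
        "frequency_pct": 100 * n_loci / len(transcripts),
        # Assumed definition: total bases divided by number of loci.
        "mean_distance_bp": total_bases / n_loci if n_loci else None,
    }
```

Calling `ssr_summary` on a dictionary of transcript sequences returns the locus count, the number of SSR-containing transcripts, and the two derived statistics; real pipelines would typically also merge nearby repeats into complex SSRs, which this sketch omits.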