Full-length SMRT transcriptome sequencing and microsatellite characterization in Paulownia catalpifolia

Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. However, few transcriptomic and genetic studies have been conducted in P. catalpifolia. In this study, single-molecule real-time sequencing technology was applied to obtain the full-len...


Bibliographic Details
Main Authors: Feng, Yanzhi, Zhao, Yang, Zhang, Jiajia, Wang, Baoping, Yang, Chaowei, Zhou, Haijiang, Qiao, Jie
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062547/
https://www.ncbi.nlm.nih.gov/pubmed/33888729
http://dx.doi.org/10.1038/s41598-021-87538-8
author Feng, Yanzhi
Zhao, Yang
Zhang, Jiajia
Wang, Baoping
Yang, Chaowei
Zhou, Haijiang
Qiao, Jie
collection PubMed
description Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. However, few transcriptomic and genetic studies have been conducted in P. catalpifolia. In this study, single-molecule real-time sequencing technology was applied to obtain the full-length transcriptome of P. catalpifolia leaves treated with varying degrees of drought stress. The sequencing data were then used to search for microsatellites, or simple sequence repeats (SSRs). A total of 28.83 Gb of data were generated; 25,969 high-quality (HQ) transcripts with an average length of 1624 bp were acquired after removing the redundant reads, and 25,602 HQ transcripts (98.59%) were annotated using public databases. Among the HQ transcripts, 16,722 intact coding sequences, 149 long non-coding RNAs and 179 alternative splicing events were predicted. A total of 7367 SSR loci, comprising 763 complex SSRs and 6604 complete SSRs, were distributed throughout 6293 HQ transcripts. The SSR appearance frequency was 28.37%, and the average distribution distance was 5.59 kb. Among the 6604 complete SSR loci, 1–3 nucleotide repeats were dominant, accounting for 97.85% of the total SSR loci, with mono-, di- and tri-nucleotide repeats making up 44.68%, 33.86% and 19.31%, respectively. We detected 112 repeat motifs, of which A/T (42.64%), AG/CT (12.22%), GA/TC (9.63%), GAA/TTC (1.57%) and CCA/TGG (1.54%) were the most common among the mono-, di- and tri-nucleotide repeats. The SSRs ranged from 10 to 88 bp in length, and 4997 (75.67%) were ≤ 20 bp. This study provides a novel full-length transcriptome reference for P. catalpifolia and will facilitate the identification of germplasm resources and breeding of new drought-resistant P. catalpifolia varieties.
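The percentages quoted in the description can be cross-checked against the counts it reports. The Python sketch below is illustrative only: it is not part of the catalog record or of the authors' pipeline. It assumes that "SSR appearance frequency" means SSR loci divided by HQ transcripts, a reading that matches the stated 28.37% but is not spelled out in the abstract. The helper name `pct` and the constant names are hypothetical.

```python
# Counts copied from the description field; nothing here is new data.
HQ_TRANSCRIPTS = 25_969      # high-quality transcripts
ANNOTATED = 25_602           # HQ transcripts annotated in public databases
SSR_LOCI = 7_367             # all SSR loci
COMPLEX_SSRS = 763
COMPLETE_SSRS = 6_604
COMPLETE_LE_20BP = 4_997     # complete SSRs of length <= 20 bp
SHARES = {"mono": 44.68, "di": 33.86, "tri": 19.31}  # percent, as reported


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


annotated_pct = pct(ANNOTATED, HQ_TRANSCRIPTS)
# Assumed definition: loci per HQ transcript, expressed as a percentage.
ssr_frequency_pct = pct(SSR_LOCI, HQ_TRANSCRIPTS)
short_ssr_pct = pct(COMPLETE_LE_20BP, COMPLETE_SSRS)
# Combined share of 1-3 nucleotide repeats.
one_to_three_pct = round(sum(SHARES.values()), 2)

if __name__ == "__main__":
    print("annotated:", annotated_pct)
    print("SSR frequency:", ssr_frequency_pct)
    print("SSRs <= 20 bp:", short_ssr_pct)
    print("1-3 nt repeats:", one_to_three_pct)
    print("complex + complete == total:",
          COMPLEX_SSRS + COMPLETE_SSRS == SSR_LOCI)
```

Under these assumptions the computed values agree with the figures in the description (98.59%, 28.37%, 75.67% and 97.85%), and the complex and complete SSR counts sum to the reported total of 7367 loci.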
format Online
Article
Text
id pubmed-8062547
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8062547 2021-04-23 Full-length SMRT transcriptome sequencing and microsatellite characterization in Paulownia catalpifolia Feng, Yanzhi Zhao, Yang Zhang, Jiajia Wang, Baoping Yang, Chaowei Zhou, Haijiang Qiao, Jie Sci Rep Article Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. However, few transcriptomic and genetic studies have been conducted in P. catalpifolia. In this study, single-molecule real-time sequencing technology was applied to obtain the full-length transcriptome of P. catalpifolia leaves treated with varying degrees of drought stress. The sequencing data were then used to search for microsatellites, or simple sequence repeats (SSRs). A total of 28.83 Gb of data were generated; 25,969 high-quality (HQ) transcripts with an average length of 1624 bp were acquired after removing the redundant reads, and 25,602 HQ transcripts (98.59%) were annotated using public databases. Among the HQ transcripts, 16,722 intact coding sequences, 149 long non-coding RNAs and 179 alternative splicing events were predicted. A total of 7367 SSR loci, comprising 763 complex SSRs and 6604 complete SSRs, were distributed throughout 6293 HQ transcripts. The SSR appearance frequency was 28.37%, and the average distribution distance was 5.59 kb. Among the 6604 complete SSR loci, 1–3 nucleotide repeats were dominant, accounting for 97.85% of the total SSR loci, with mono-, di- and tri-nucleotide repeats making up 44.68%, 33.86% and 19.31%, respectively. We detected 112 repeat motifs, of which A/T (42.64%), AG/CT (12.22%), GA/TC (9.63%), GAA/TTC (1.57%) and CCA/TGG (1.54%) were the most common among the mono-, di- and tri-nucleotide repeats. The SSRs ranged from 10 to 88 bp in length, and 4997 (75.67%) were ≤ 20 bp. This study provides a novel full-length transcriptome reference for P. catalpifolia and will facilitate the identification of germplasm resources and breeding of new drought-resistant P. catalpifolia varieties. Nature Publishing Group UK 2021-04-22 /pmc/articles/PMC8062547/ /pubmed/33888729 http://dx.doi.org/10.1038/s41598-021-87538-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Full-length SMRT transcriptome sequencing and microsatellite characterization in Paulownia catalpifolia
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062547/
https://www.ncbi.nlm.nih.gov/pubmed/33888729
http://dx.doi.org/10.1038/s41598-021-87538-8