DRDB: An Online Date Palm Genomic Resource Database

Background: Date palm (Phoenix dactylifera L.) is a cultivated woody plant with agricultural and economic importance in many countries around the world. With the advantages of next generation sequencing technologies, genome sequences for many date palm cultivars have been released recently. Short se...

Bibliographic Details
Main Authors: He, Zilong, Zhang, Chengwei, Liu, Wanfei, Lin, Qiang, Wei, Ting, Aljohi, Hasan A., Chen, Wei-Hua, Hu, Songnian
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2017
Subjects: Plant Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5701633/
https://www.ncbi.nlm.nih.gov/pubmed/29209336
http://dx.doi.org/10.3389/fpls.2017.01889
_version_ 1783281383768064000
author He, Zilong
Zhang, Chengwei
Liu, Wanfei
Lin, Qiang
Wei, Ting
Aljohi, Hasan A.
Chen, Wei-Hua
Hu, Songnian
author_sort He, Zilong
collection PubMed
description Background: Date palm (Phoenix dactylifera L.) is a cultivated woody plant with agricultural and economic importance in many countries around the world. With the advantages of next generation sequencing technologies, genome sequences for many date palm cultivars have been released recently. Short sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) can be identified from these genomic data, and have proven to be very useful biomarkers in plant genome analysis and breeding. Results: Here, we first improved the date palm genome assembly using 130X of HiSeq data generated in our lab. Then 246,445 SSRs (214,901 SSRs and 31,544 compound SSRs) were annotated in this genome assembly; among the SSRs, mononucleotide SSRs (58.92%) were the most abundant, followed by di- (29.92%), tri- (8.14%), tetra- (2.47%), penta- (0.36%), and hexa-nucleotide SSRs (0.19%). High-quality PCR primer pairs were designed for most SSRs (174,497; 70.81% of the total). We also annotated 6,375,806 SNPs with raw read depth ≥ 3 in 90% of cultivars. To further reduce false-positive SNPs, we kept only the 5,572,650 SNPs (87.40% of the total) supported by at least 20% of cultivars for downstream analyses. High-quality PCR primer pairs were also obtained for 4,177,778 (65.53%) SNPs. We reconstructed the phylogenetic relationships among the 62 cultivars using these variants and found that they can be divided into three clusters, namely North Africa, Egypt – Sudan, and Middle East – South Asian, with Egypt – Sudan being an admixture of North Africa and Middle East – South Asian cultivars; we further confirmed these clusters using principal component analysis. Moreover, 34,346 SSRs and 4,177,778 SNPs with PCR primers were assigned to shared cultivars for cultivar classification and diversity analysis. All these SSRs, SNPs and their classification are available in our database, and can be used for cultivar identification, comparison, and molecular breeding.
Conclusion: DRDB is a comprehensive genomic resource database of date palm. It can serve as a bioinformatics platform for date palm genomics, genetics, and molecular breeding. DRDB is freely available at http://drdb.big.ac.cn/home.
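The description above groups SSRs by motif length (mono- through hexa-nucleotide). As a rough illustration of what such an annotation step involves, here is a minimal Python sketch that finds perfect tandem repeats with regular expressions and tallies them by class. It is not the authors' pipeline: the record does not name the tool or thresholds used, so the minimum repeat counts in `MIN_REPEATS`, the function names, and the toy sequence are all assumptions made for illustration. Compound SSRs and repeats that overlap an earlier match are not handled.

```python
import re
from collections import Counter

# ASSUMPTION: minimum repeat counts per motif length are illustrative only;
# the record does not state the thresholds used for DRDB.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
CLASS_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def is_primitive(motif):
    """Return True if the motif is not a repeat of a shorter unit ('ATAT' is not)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. 'AA' repeats, which are already counted as mononucleotide.
            if is_primitive(motif):
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    return sorted(hits)


def count_by_class(hits):
    """Tally SSRs by motif-length class (mono, di, tri, ...)."""
    return Counter(CLASS_NAMES[len(motif)] for _, _, motif, _ in hits)


if __name__ == "__main__":
    # Toy sequence, not real date palm data.
    toy = "GG" + "A" * 12 + "CC" + "AG" * 7 + "TT" + "ACG" * 5 + "C"
    for start, end, motif, reps in find_ssrs(toy):
        print(start, end, motif, reps)
    print(dict(count_by_class(find_ssrs(toy))))
```

Dividing each class count by the total number of hits gives per-class shares comparable in form to the percentages quoted in the description, although the values depend entirely on the thresholds chosen.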
format Online
Article
Text
id pubmed-5701633
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5701633 2017-12-05 DRDB: An Online Date Palm Genomic Resource Database He, Zilong Zhang, Chengwei Liu, Wanfei Lin, Qiang Wei, Ting Aljohi, Hasan A. Chen, Wei-Hua Hu, Songnian Front Plant Sci Plant Science Frontiers Media S.A. 2017-11-02 /pmc/articles/PMC5701633/ /pubmed/29209336 http://dx.doi.org/10.3389/fpls.2017.01889 Text en Copyright © 2017 He, Zhang, Liu, Lin, Wei, Aljohi, Chen and Hu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title DRDB: An Online Date Palm Genomic Resource Database
topic Plant Science