Large-scale collection and annotation of gene models for date palm (Phoenix dactylifera, L.)

The date palm (Phoenix dactylifera L.), famed for its sugar-rich fruits (dates) and cultivated by humans since 4,000 B.C., is an economically important crop in the Middle East, Northern Africa, and increasingly other places where climates are suitable. Despite a long history of human cultivation, th...

Bibliographic Details
Main authors: Zhang, Guangyu; Pan, Linlin; Yin, Yuxin; Liu, Wanfei; Huang, Dawei; Zhang, Tongwu; Wang, Lei; Xin, Chengqi; Lin, Qiang; Sun, Gaoyuan; Ba Abdullah, Mohammed M.; Zhang, Xiaowei; Hu, Songnian; Al-Mssallem, Ibrahim S.; Yu, Jun
Format: Online Article Text
Language: English
Published: Springer Netherlands, 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402680/
https://www.ncbi.nlm.nih.gov/pubmed/22736259
http://dx.doi.org/10.1007/s11103-012-9924-z
author Zhang, Guangyu
Pan, Linlin
Yin, Yuxin
Liu, Wanfei
Huang, Dawei
Zhang, Tongwu
Wang, Lei
Xin, Chengqi
Lin, Qiang
Sun, Gaoyuan
Ba Abdullah, Mohammed M.
Zhang, Xiaowei
Hu, Songnian
Al-Mssallem, Ibrahim S.
Yu, Jun
author_sort Zhang, Guangyu
collection PubMed
description The date palm (Phoenix dactylifera L.), famed for its sugar-rich fruits (dates) and cultivated by humans since 4,000 B.C., is an economically important crop in the Middle East, Northern Africa, and increasingly in other regions with suitable climates. Despite this long history of cultivation, the understanding of P. dactylifera genetics and molecular biology is rather limited, hindered by a lack of high-quality basic data from genomics and transcriptomics. Here we report a large-scale effort to generate gene models (expressed sequence tags, or ESTs, assembled and mapped to a genome assembly) for P. dactylifera, using the long-read pyrosequencing platform (Roche/454 GS FLX Titanium) at high coverage. We built fourteen cDNA libraries from different P. dactylifera tissues (cultivar Khalas) and acquired 15,778,993 raw sequencing reads, about one million reads per library; the pooled sequences were assembled into 67,651 non-redundant contigs and 301,978 singletons. We annotated 52,725 contigs based on plant databases and 45 contigs based on functional domains with reference to the Pfam database. Among the annotated contigs, we assigned GO (Gene Ontology) terms to 36,086 contigs and KEGG pathways to 7,032 contigs. Our comparative analysis showed that 70.6% (47,930), 69.4% (47,089), 68.4% (46,441), and 69.3% (47,048) of the P. dactylifera gene models are shared with rice, sorghum, Arabidopsis, and grapevine, respectively. We also classified our gene models as housekeeping or tissue-specific genes based on their tissue specificity. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11103-012-9924-z) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-3402680
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-3402680 2012-07-26 Plant Mol Biol Article Springer Netherlands 2012-06-27 2012 /pmc/articles/PMC3402680/ /pubmed/22736259 http://dx.doi.org/10.1007/s11103-012-9924-z Text en © The Author(s) 2012 https://creativecommons.org/licenses/by/4.0/ This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
title Large-scale collection and annotation of gene models for date palm (Phoenix dactylifera, L.)
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402680/
https://www.ncbi.nlm.nih.gov/pubmed/22736259
http://dx.doi.org/10.1007/s11103-012-9924-z