The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data
Tea is a globally consumed non-alcoholic beverage of great economic importance. However, the lack of a reference genome has largely hampered the use of precious tea plant genetic resources in breeding. To address this issue, we previously generated a high-quality reference genome of the tea p...
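Main authors: | Xia, Enhua; Li, Fangdong; Tong, Wei; Yang, Hua; Wang, Songbo; Zhao, Jian; Liu, Chun; Gao, Liping; Tai, Yuling; She, Guangbiao; Sun, Jun; Cao, Haisheng; Gao, Qiang; Li, Yeyun; Deng, Weiwei; Jiang, Xiaolan; Wang, Wenzhao; Chen, Qi; Zhang, Shihua; Li, Haijing; Wu, Junlan; Wang, Ping; Li, Penghui; Shi, Chengying; Zheng, Fengya; Jian, Jianbo; Huang, Bei; Shan, Dai; Shi, Mingming; Fang, Congbing; Yue, Yi; Wu, Qiong; Ge, Ruoheng; Zhao, Huijuan; Li, Daxiang; Wei, Shu; Han, Bin; Jiang, Changjun; Yin, Ye; Xia, Tao; Zhang, Zhengzhu; Zhao, Shancen; Bennetzen, Jeffrey L.; Wei, Chaoling; Wan, Xiaochun |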
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6629666/ https://www.ncbi.nlm.nih.gov/pubmed/31308375 http://dx.doi.org/10.1038/s41597-019-0127-1 |
_version_ | 1783435137562705920 |
---|---|
author | Xia, Enhua Li, Fangdong Tong, Wei Yang, Hua Wang, Songbo Zhao, Jian Liu, Chun Gao, Liping Tai, Yuling She, Guangbiao Sun, Jun Cao, Haisheng Gao, Qiang Li, Yeyun Deng, Weiwei Jiang, Xiaolan Wang, Wenzhao Chen, Qi Zhang, Shihua Li, Haijing Wu, Junlan Wang, Ping Li, Penghui Shi, Chengying Zheng, Fengya Jian, Jianbo Huang, Bei Shan, Dai Shi, Mingming Fang, Congbing Yue, Yi Wu, Qiong Ge, Ruoheng Zhao, Huijuan Li, Daxiang Wei, Shu Han, Bin Jiang, Changjun Yin, Ye Xia, Tao Zhang, Zhengzhu Zhao, Shancen Bennetzen, Jeffrey L. Wei, Chaoling Wan, Xiaochun |
author_facet | Xia, Enhua Li, Fangdong Tong, Wei Yang, Hua Wang, Songbo Zhao, Jian Liu, Chun Gao, Liping Tai, Yuling She, Guangbiao Sun, Jun Cao, Haisheng Gao, Qiang Li, Yeyun Deng, Weiwei Jiang, Xiaolan Wang, Wenzhao Chen, Qi Zhang, Shihua Li, Haijing Wu, Junlan Wang, Ping Li, Penghui Shi, Chengying Zheng, Fengya Jian, Jianbo Huang, Bei Shan, Dai Shi, Mingming Fang, Congbing Yue, Yi Wu, Qiong Ge, Ruoheng Zhao, Huijuan Li, Daxiang Wei, Shu Han, Bin Jiang, Changjun Yin, Ye Xia, Tao Zhang, Zhengzhu Zhao, Shancen Bennetzen, Jeffrey L. Wei, Chaoling Wan, Xiaochun |
author_sort | Xia, Enhua |
collection | PubMed |
description | Tea is a globally consumed non-alcoholic beverage of great economic importance. However, the lack of a reference genome has largely hampered the use of precious tea plant genetic resources in breeding. To address this issue, we previously generated a high-quality reference genome of the tea plant using Illumina and PacBio sequencing technology, which produced a total of 2,124 Gb of short-read and 125 Gb of long-read data, respectively. A hybrid strategy was employed to assemble the tea genome, which has been publicly released. Here we describe the data framework used to generate, annotate and validate the genome assembly. In addition, we re-predicted the protein-coding genes and annotated their putative functions using more comprehensive omics datasets with improved training models. We reassessed the assembly and annotation quality using the latest version of BUSCO. These data can be used to develop new methodologies and tools for better assembly of complex genomes and to aid in finding novel genes, variations and evolutionary clues associated with tea quality, thus helping to breed new varieties with high yield and better quality in the future. |
format | Online Article Text |
id | pubmed-6629666 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-66296662019-07-16 The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data Xia, Enhua Li, Fangdong Tong, Wei Yang, Hua Wang, Songbo Zhao, Jian Liu, Chun Gao, Liping Tai, Yuling She, Guangbiao Sun, Jun Cao, Haisheng Gao, Qiang Li, Yeyun Deng, Weiwei Jiang, Xiaolan Wang, Wenzhao Chen, Qi Zhang, Shihua Li, Haijing Wu, Junlan Wang, Ping Li, Penghui Shi, Chengying Zheng, Fengya Jian, Jianbo Huang, Bei Shan, Dai Shi, Mingming Fang, Congbing Yue, Yi Wu, Qiong Ge, Ruoheng Zhao, Huijuan Li, Daxiang Wei, Shu Han, Bin Jiang, Changjun Yin, Ye Xia, Tao Zhang, Zhengzhu Zhao, Shancen Bennetzen, Jeffrey L. Wei, Chaoling Wan, Xiaochun Sci Data Data Descriptor Tea is a globally consumed non-alcohol beverage with great economic importance. However, lack of the reference genome has largely hampered the utilization of precious tea plant genetic resources towards breeding. To address this issue, we previously generated a high-quality reference genome of tea plant using Illumina and PacBio sequencing technology, which produced a total of 2,124 Gb short and 125 Gb long read data, respectively. A hybrid strategy was employed to assemble the tea genome that has been publicly released. We here described the data framework used to generate, annotate and validate the genome assembly. Besides, we re-predicted the protein-coding genes and annotated their putative functions using more comprehensive omics datasets with improved training models. We reassessed the assembly and annotation quality using the latest version of BUSCO. These data can be utilized to develop new methodologies/tools for better assembly of complex genomes, aid in finding of novel genes, variations and evolutionary clues associated with tea quality, thus help to breed new varieties with high yield and better quality in the future. 
Nature Publishing Group UK 2019-07-15 /pmc/articles/PMC6629666/ /pubmed/31308375 http://dx.doi.org/10.1038/s41597-019-0127-1 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article. |
spellingShingle | Data Descriptor Xia, Enhua Li, Fangdong Tong, Wei Yang, Hua Wang, Songbo Zhao, Jian Liu, Chun Gao, Liping Tai, Yuling She, Guangbiao Sun, Jun Cao, Haisheng Gao, Qiang Li, Yeyun Deng, Weiwei Jiang, Xiaolan Wang, Wenzhao Chen, Qi Zhang, Shihua Li, Haijing Wu, Junlan Wang, Ping Li, Penghui Shi, Chengying Zheng, Fengya Jian, Jianbo Huang, Bei Shan, Dai Shi, Mingming Fang, Congbing Yue, Yi Wu, Qiong Ge, Ruoheng Zhao, Huijuan Li, Daxiang Wei, Shu Han, Bin Jiang, Changjun Yin, Ye Xia, Tao Zhang, Zhengzhu Zhao, Shancen Bennetzen, Jeffrey L. Wei, Chaoling Wan, Xiaochun The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
title | The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
title_full | The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
title_fullStr | The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
title_full_unstemmed | The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
title_short | The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
title_sort | tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6629666/ https://www.ncbi.nlm.nih.gov/pubmed/31308375 http://dx.doi.org/10.1038/s41597-019-0127-1 |