The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data

Tea is a globally consumed non-alcohol beverage with great economic importance. However, lack of the reference genome has largely hampered the utilization of precious tea plant genetic resources towards breeding. To address this issue, we previously generated a high-quality reference genome of tea p...

Bibliographic Details
Main authors: Xia, Enhua; Li, Fangdong; Tong, Wei; Yang, Hua; Wang, Songbo; Zhao, Jian; Liu, Chun; Gao, Liping; Tai, Yuling; She, Guangbiao; Sun, Jun; Cao, Haisheng; Gao, Qiang; Li, Yeyun; Deng, Weiwei; Jiang, Xiaolan; Wang, Wenzhao; Chen, Qi; Zhang, Shihua; Li, Haijing; Wu, Junlan; Wang, Ping; Li, Penghui; Shi, Chengying; Zheng, Fengya; Jian, Jianbo; Huang, Bei; Shan, Dai; Shi, Mingming; Fang, Congbing; Yue, Yi; Wu, Qiong; Ge, Ruoheng; Zhao, Huijuan; Li, Daxiang; Wei, Shu; Han, Bin; Jiang, Changjun; Yin, Ye; Xia, Tao; Zhang, Zhengzhu; Zhao, Shancen; Bennetzen, Jeffrey L.; Wei, Chaoling; Wan, Xiaochun
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects: Data Descriptor
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6629666/
https://www.ncbi.nlm.nih.gov/pubmed/31308375
http://dx.doi.org/10.1038/s41597-019-0127-1
author Xia, Enhua
Li, Fangdong
Tong, Wei
Yang, Hua
Wang, Songbo
Zhao, Jian
Liu, Chun
Gao, Liping
Tai, Yuling
She, Guangbiao
Sun, Jun
Cao, Haisheng
Gao, Qiang
Li, Yeyun
Deng, Weiwei
Jiang, Xiaolan
Wang, Wenzhao
Chen, Qi
Zhang, Shihua
Li, Haijing
Wu, Junlan
Wang, Ping
Li, Penghui
Shi, Chengying
Zheng, Fengya
Jian, Jianbo
Huang, Bei
Shan, Dai
Shi, Mingming
Fang, Congbing
Yue, Yi
Wu, Qiong
Ge, Ruoheng
Zhao, Huijuan
Li, Daxiang
Wei, Shu
Han, Bin
Jiang, Changjun
Yin, Ye
Xia, Tao
Zhang, Zhengzhu
Zhao, Shancen
Bennetzen, Jeffrey L.
Wei, Chaoling
Wan, Xiaochun
author_sort Xia, Enhua
collection PubMed
description Tea is a globally consumed non-alcohol beverage with great economic importance. However, lack of the reference genome has largely hampered the utilization of precious tea plant genetic resources towards breeding. To address this issue, we previously generated a high-quality reference genome of tea plant using Illumina and PacBio sequencing technology, which produced a total of 2,124 Gb short and 125 Gb long read data, respectively. A hybrid strategy was employed to assemble the tea genome that has been publicly released. We here described the data framework used to generate, annotate and validate the genome assembly. Besides, we re-predicted the protein-coding genes and annotated their putative functions using more comprehensive omics datasets with improved training models. We reassessed the assembly and annotation quality using the latest version of BUSCO. These data can be utilized to develop new methodologies/tools for better assembly of complex genomes, aid in finding of novel genes, variations and evolutionary clues associated with tea quality, thus help to breed new varieties with high yield and better quality in the future.
format Online
Article
Text
id pubmed-6629666
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6629666 2019-07-16 The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data Xia, Enhua Li, Fangdong Tong, Wei Yang, Hua Wang, Songbo Zhao, Jian Liu, Chun Gao, Liping Tai, Yuling She, Guangbiao Sun, Jun Cao, Haisheng Gao, Qiang Li, Yeyun Deng, Weiwei Jiang, Xiaolan Wang, Wenzhao Chen, Qi Zhang, Shihua Li, Haijing Wu, Junlan Wang, Ping Li, Penghui Shi, Chengying Zheng, Fengya Jian, Jianbo Huang, Bei Shan, Dai Shi, Mingming Fang, Congbing Yue, Yi Wu, Qiong Ge, Ruoheng Zhao, Huijuan Li, Daxiang Wei, Shu Han, Bin Jiang, Changjun Yin, Ye Xia, Tao Zhang, Zhengzhu Zhao, Shancen Bennetzen, Jeffrey L. Wei, Chaoling Wan, Xiaochun Sci Data Data Descriptor Tea is a globally consumed non-alcohol beverage with great economic importance. However, lack of the reference genome has largely hampered the utilization of precious tea plant genetic resources towards breeding. To address this issue, we previously generated a high-quality reference genome of tea plant using Illumina and PacBio sequencing technology, which produced a total of 2,124 Gb short and 125 Gb long read data, respectively. A hybrid strategy was employed to assemble the tea genome that has been publicly released. We here described the data framework used to generate, annotate and validate the genome assembly. Besides, we re-predicted the protein-coding genes and annotated their putative functions using more comprehensive omics datasets with improved training models. We reassessed the assembly and annotation quality using the latest version of BUSCO. These data can be utilized to develop new methodologies/tools for better assembly of complex genomes, aid in finding of novel genes, variations and evolutionary clues associated with tea quality, thus help to breed new varieties with high yield and better quality in the future.
Nature Publishing Group UK 2019-07-15 /pmc/articles/PMC6629666/ /pubmed/31308375 http://dx.doi.org/10.1038/s41597-019-0127-1 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
title The tea plant reference genome and improved gene annotation using long-read and paired-end sequencing data
topic Data Descriptor