
A pangenome analysis pipeline provides insights into functional gene identification in rice


Bibliographic Details
Main authors: Wang, Jian; Yang, Wu; Zhang, Shaohong; Hu, Haifei; Yuan, Yuxuan; Dong, Jingfang; Chen, Luo; Ma, Yamei; Yang, Tifeng; Zhou, Lian; Chen, Jiansong; Liu, Bin; Li, Chengdao; Edwards, David; Zhao, Junliang
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9878884/
https://www.ncbi.nlm.nih.gov/pubmed/36703158
http://dx.doi.org/10.1186/s13059-023-02861-9
author Wang, Jian
Yang, Wu
Zhang, Shaohong
Hu, Haifei
Yuan, Yuxuan
Dong, Jingfang
Chen, Luo
Ma, Yamei
Yang, Tifeng
Zhou, Lian
Chen, Jiansong
Liu, Bin
Li, Chengdao
Edwards, David
Zhao, Junliang
author_sort Wang, Jian
collection PubMed
description BACKGROUND: A pangenome aims to capture the complete genetic diversity within a species and reduce bias in genetic analysis inherent in using a single reference genome. However, the current linear format of most plant pangenomes limits the presentation of position information for novel sequences. Graph pangenomes have been developed to overcome this limitation. However, bioinformatics analysis tools for graph format genomes are lacking. RESULTS: To overcome this problem, we develop a novel strategy for pangenome construction and a downstream pangenome analysis pipeline (PSVCP) that captures genetic variants’ position information while maintaining a linearized layout. Using PSVCP, we construct a high-quality rice pangenome using 12 representative rice genomes and analyze an international rice panel with 413 diverse accessions using the pangenome as the reference. We show that PSVCP successfully identifies causal structural variations for rice grain weight and plant height. Our results provide insights into rice population structure and genomic diversity. We characterize a new locus (qPH8-1) associated with plant height on chromosome 8 undetected by the SNP-based genome-wide association study (GWAS). CONCLUSIONS: Our results demonstrate that the pangenome constructed by our pipeline combined with a presence and absence variation-based GWAS can provide additional power for genomic and genetic analysis. The pangenome constructed in this study and the associated genome sequence and genetic variants data provide valuable genomic resources for rice genomics research and improvement in future. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-02861-9.
format Online
Article
Text
id pubmed-9878884
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-98788842023-01-27 A pangenome analysis pipeline provides insights into functional gene identification in rice Wang, Jian Yang, Wu Zhang, Shaohong Hu, Haifei Yuan, Yuxuan Dong, Jingfang Chen, Luo Ma, Yamei Yang, Tifeng Zhou, Lian Chen, Jiansong Liu, Bin Li, Chengdao Edwards, David Zhao, Junliang Genome Biol Research BACKGROUND: A pangenome aims to capture the complete genetic diversity within a species and reduce bias in genetic analysis inherent in using a single reference genome. However, the current linear format of most plant pangenomes limits the presentation of position information for novel sequences. Graph pangenomes have been developed to overcome this limitation. However, bioinformatics analysis tools for graph format genomes are lacking. RESULTS: To overcome this problem, we develop a novel strategy for pangenome construction and a downstream pangenome analysis pipeline (PSVCP) that captures genetic variants’ position information while maintaining a linearized layout. Using PSVCP, we construct a high-quality rice pangenome using 12 representative rice genomes and analyze an international rice panel with 413 diverse accessions using the pangenome as the reference. We show that PSVCP successfully identifies causal structural variations for rice grain weight and plant height. Our results provide insights into rice population structure and genomic diversity. We characterize a new locus (qPH8-1) associated with plant height on chromosome 8 undetected by the SNP-based genome-wide association study (GWAS). CONCLUSIONS: Our results demonstrate that the pangenome constructed by our pipeline combined with a presence and absence variation-based GWAS can provide additional power for genomic and genetic analysis. The pangenome constructed in this study and the associated genome sequence and genetic variants data provide valuable genomic resources for rice genomics research and improvement in future. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-02861-9. BioMed Central 2023-01-26 /pmc/articles/PMC9878884/ /pubmed/36703158 http://dx.doi.org/10.1186/s13059-023-02861-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A pangenome analysis pipeline provides insights into functional gene identification in rice
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9878884/
https://www.ncbi.nlm.nih.gov/pubmed/36703158
http://dx.doi.org/10.1186/s13059-023-02861-9