A pangenome reference of 36 Chinese populations

Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype...

Bibliographic Details
Main Authors: Gao, Yang, Yang, Xiaofei, Chen, Hao, Tan, Xinjiang, Yang, Zhaoqing, Deng, Lian, Wang, Baonan, Kong, Shuang, Li, Songyang, Cui, Yuhang, Lei, Chang, Wang, Yimin, Pan, Yuwen, Ma, Sen, Sun, Hao, Zhao, Xiaohan, Shi, Yingbing, Yang, Ziyi, Wu, Dongdong, Wu, Shaoyuan, Zhao, Xingming, Shi, Binyin, Jin, Li, Hu, Zhibin, Lu, Yan, Chu, Jiayou, Ye, Kai, Xu, Shuhua
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10322713/
https://www.ncbi.nlm.nih.gov/pubmed/37316654
http://dx.doi.org/10.1038/s41586-023-06173-7
author Gao, Yang
Yang, Xiaofei
Chen, Hao
Tan, Xinjiang
Yang, Zhaoqing
Deng, Lian
Wang, Baonan
Kong, Shuang
Li, Songyang
Cui, Yuhang
Lei, Chang
Wang, Yimin
Pan, Yuwen
Ma, Sen
Sun, Hao
Zhao, Xiaohan
Shi, Yingbing
Yang, Ziyi
Wu, Dongdong
Wu, Shaoyuan
Zhao, Xingming
Shi, Binyin
Jin, Li
Hu, Zhibin
Lu, Yan
Chu, Jiayou
Ye, Kai
Xu, Shuhua
author_sort Gao, Yang
collection PubMed
description Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65× high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference(1). The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping.
format Online
Article
Text
id pubmed-10322713
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10322713 2023-07-07 A pangenome reference of 36 Chinese populations Nature Article
Nature Publishing Group UK 2023-06-14 2023 /pmc/articles/PMC10322713/ /pubmed/37316654 http://dx.doi.org/10.1038/s41586-023-06173-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A pangenome reference of 36 Chinese populations
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10322713/
https://www.ncbi.nlm.nih.gov/pubmed/37316654
http://dx.doi.org/10.1038/s41586-023-06173-7