A pangenome reference of 36 Chinese populations
Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype...
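Main authors: | Gao, Yang; Yang, Xiaofei; Chen, Hao; Tan, Xinjiang; Yang, Zhaoqing; Deng, Lian; Wang, Baonan; Kong, Shuang; Li, Songyang; Cui, Yuhang; Lei, Chang; Wang, Yimin; Pan, Yuwen; Ma, Sen; Sun, Hao; Zhao, Xiaohan; Shi, Yingbing; Yang, Ziyi; Wu, Dongdong; Wu, Shaoyuan; Zhao, Xingming; Shi, Binyin; Jin, Li; Hu, Zhibin; Lu, Yan; Chu, Jiayou; Ye, Kai; Xu, Shuhua |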
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10322713/ https://www.ncbi.nlm.nih.gov/pubmed/37316654 http://dx.doi.org/10.1038/s41586-023-06173-7 |
_version_ | 1785068819381747712 |
---|---|
author | Gao, Yang; Yang, Xiaofei; Chen, Hao; Tan, Xinjiang; Yang, Zhaoqing; Deng, Lian; Wang, Baonan; Kong, Shuang; Li, Songyang; Cui, Yuhang; Lei, Chang; Wang, Yimin; Pan, Yuwen; Ma, Sen; Sun, Hao; Zhao, Xiaohan; Shi, Yingbing; Yang, Ziyi; Wu, Dongdong; Wu, Shaoyuan; Zhao, Xingming; Shi, Binyin; Jin, Li; Hu, Zhibin; Lu, Yan; Chu, Jiayou; Ye, Kai; Xu, Shuhua |
author_sort | Gao, Yang |
collection | PubMed |
description | Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65× high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference(1). The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping. |
format | Online Article Text |
id | pubmed-10322713 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10322713 2023-07-07 Nature Article Nature Publishing Group UK 2023-06-14 2023 /pmc/articles/PMC10322713/ /pubmed/37316654 http://dx.doi.org/10.1038/s41586-023-06173-7 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
title | A pangenome reference of 36 Chinese populations |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10322713/ https://www.ncbi.nlm.nih.gov/pubmed/37316654 http://dx.doi.org/10.1038/s41586-023-06173-7 |