
De novo assembly of a Tibetan genome and identification of novel structural variants associated with high-altitude adaptation

Structural variants (SVs) may play important roles in human adaptation to extreme environments such as high altitude but have been under-investigated. Here, combining long-read sequencing with multiple scaffolding techniques, we assembled a high-quality Tibetan genome (ZF1), with a contig N50 length...

Bibliographic Details
Main Authors: He, Yaoxi, Lou, Haiyi, Cui, Chaoying, Deng, Lian, Gao, Yang, Zheng, Wangshan, Guo, Yongbo, Wang, Xiaoji, Ning, Zhilin, Li, Jun, Li, Bin, Bai, Caijuan, Liu, Shiming, Wu, Tianyi, Xu, Shuhua, Qi, Xuebin, Su, Bing
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8288928/
https://www.ncbi.nlm.nih.gov/pubmed/34692055
http://dx.doi.org/10.1093/nsr/nwz160
author He, Yaoxi
Lou, Haiyi
Cui, Chaoying
Deng, Lian
Gao, Yang
Zheng, Wangshan
Guo, Yongbo
Wang, Xiaoji
Ning, Zhilin
Li, Jun
Li, Bin
Bai, Caijuan
Liu, Shiming
Wu, Tianyi
Xu, Shuhua
Qi, Xuebin
Su, Bing
collection PubMed
description Structural variants (SVs) may play important roles in human adaptation to extreme environments such as high altitude but have been under-investigated. Here, combining long-read sequencing with multiple scaffolding techniques, we assembled a high-quality Tibetan genome (ZF1), with a contig N50 length of 24.57 mega-base pairs (Mb) and a scaffold N50 length of 58.80 Mb. The ZF1 assembly filled 80 remaining N-gaps (0.25 Mb in total length) in the reference human genome (GRCh38). Markedly, we detected 17 900 SVs, among which the ZF1-specific SVs are enriched in GTPase activity that is required for activation of the hypoxic pathway. Further population analysis uncovered a 163-bp intronic deletion in the MKL1 gene showing large divergence between highland Tibetans and lowland Han Chinese. This deletion is significantly associated with lower systolic pulmonary arterial pressure, one of the key adaptive physiological traits in Tibetans. Moreover, with the use of the high-quality de novo assembly, we observed a much higher rate of genome-wide archaic hominid (Altai Neanderthal and Denisovan) shared non-reference sequences in ZF1 (1.32%–1.53%) compared to other East Asian genomes (0.70%–0.98%), reflecting a unique genomic composition of Tibetans. One such archaic hominid shared sequence—a 662-bp intronic insertion in the SCUBE2 gene—is enriched and associated with better lung function (the FEV1/FVC ratio) in Tibetans. Collectively, we generated the first high-resolution Tibetan reference genome, and the identified SVs may serve as valuable resources for future evolutionary and medical studies.
format Online
Article
Text
id pubmed-8288928
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8288928 2021-10-21 Natl Sci Rev Research Article
Oxford University Press 2020-02 2019-10-23 /pmc/articles/PMC8288928/ /pubmed/34692055 http://dx.doi.org/10.1093/nsr/nwz160 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title De novo assembly of a Tibetan genome and identification of novel structural variants associated with high-altitude adaptation
topic Research Article