Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet

BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using...

Bibliographic Details
Main authors: Jia, Peng; Dong, Lianhua; Yang, Xiaofei; Wang, Bo; Bush, Stephen J.; Wang, Tingjie; Lin, Jiadong; Wang, Songbo; Zhao, Xixi; Xu, Tun; Che, Yizhuo; Dang, Ningxin; Ren, Luyao; Zhang, Yujing; Wang, Xia; Liang, Fan; Wang, Yang; Ruan, Jue; Xia, Han; Zheng, Yuanting; Shi, Leming; Lv, Yi; Wang, Jing; Ye, Kai
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10694985/
http://dx.doi.org/10.1186/s13059-023-03116-3
author Jia, Peng
Dong, Lianhua
Yang, Xiaofei
Wang, Bo
Bush, Stephen J.
Wang, Tingjie
Lin, Jiadong
Wang, Songbo
Zhao, Xixi
Xu, Tun
Che, Yizhuo
Dang, Ningxin
Ren, Luyao
Zhang, Yujing
Wang, Xia
Liang, Fan
Wang, Yang
Ruan, Jue
Xia, Han
Zheng, Yuanting
Shi, Leming
Lv, Yi
Wang, Jing
Ye, Kai
author_sort Jia, Peng
collection PubMed
description BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS: The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent–child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity—including those located at long repeat regions, complex structural variants, and de novo mutations—are systematically examined in this study. CONCLUSIONS: In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03116-3.
format Online
Article
Text
id pubmed-10694985
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10694985 2023-12-05 Genome Biol Research BioMed Central 2023-12-04 /pmc/articles/PMC10694985/ http://dx.doi.org/10.1186/s13059-023-03116-3 Text en © The Author(s) 2023
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10694985/
http://dx.doi.org/10.1186/s13059-023-03116-3