Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet
BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using...
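Main authors: | Jia, Peng; Dong, Lianhua; Yang, Xiaofei; Wang, Bo; Bush, Stephen J.; Wang, Tingjie; Lin, Jiadong; Wang, Songbo; Zhao, Xixi; Xu, Tun; Che, Yizhuo; Dang, Ningxin; Ren, Luyao; Zhang, Yujing; Wang, Xia; Liang, Fan; Wang, Yang; Ruan, Jue; Xia, Han; Zheng, Yuanting; Shi, Leming; Lv, Yi; Wang, Jing; Ye, Kai |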
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10694985/ http://dx.doi.org/10.1186/s13059-023-03116-3 |
_version_ | 1785153495665475584 |
---|---|
author | Jia, Peng Dong, Lianhua Yang, Xiaofei Wang, Bo Bush, Stephen J. Wang, Tingjie Lin, Jiadong Wang, Songbo Zhao, Xixi Xu, Tun Che, Yizhuo Dang, Ningxin Ren, Luyao Zhang, Yujing Wang, Xia Liang, Fan Wang, Yang Ruan, Jue Xia, Han Zheng, Yuanting Shi, Leming Lv, Yi Wang, Jing Ye, Kai |
author_facet | Jia, Peng Dong, Lianhua Yang, Xiaofei Wang, Bo Bush, Stephen J. Wang, Tingjie Lin, Jiadong Wang, Songbo Zhao, Xixi Xu, Tun Che, Yizhuo Dang, Ningxin Ren, Luyao Zhang, Yujing Wang, Xia Liang, Fan Wang, Yang Ruan, Jue Xia, Han Zheng, Yuanting Shi, Leming Lv, Yi Wang, Jing Ye, Kai |
author_sort | Jia, Peng |
collection | PubMed |
description | BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS: The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent–child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity—including those located at long repeat regions, complex structural variants, and de novo mutations—are systematically examined in this study. CONCLUSIONS: In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03116-3. |
format | Online Article Text |
id | pubmed-10694985 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-106949852023-12-05 Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet Jia, Peng Dong, Lianhua Yang, Xiaofei Wang, Bo Bush, Stephen J. Wang, Tingjie Lin, Jiadong Wang, Songbo Zhao, Xixi Xu, Tun Che, Yizhuo Dang, Ningxin Ren, Luyao Zhang, Yujing Wang, Xia Liang, Fan Wang, Yang Ruan, Jue Xia, Han Zheng, Yuanting Shi, Leming Lv, Yi Wang, Jing Ye, Kai Genome Biol Research BACKGROUND: Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS: The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent–child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity—including those located at long repeat regions, complex structural variants, and de novo mutations—are systematically examined in this study. CONCLUSIONS: In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03116-3. BioMed Central 2023-12-04 /pmc/articles/PMC10694985/ http://dx.doi.org/10.1186/s13059-023-03116-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Jia, Peng Dong, Lianhua Yang, Xiaofei Wang, Bo Bush, Stephen J. Wang, Tingjie Lin, Jiadong Wang, Songbo Zhao, Xixi Xu, Tun Che, Yizhuo Dang, Ningxin Ren, Luyao Zhang, Yujing Wang, Xia Liang, Fan Wang, Yang Ruan, Jue Xia, Han Zheng, Yuanting Shi, Leming Lv, Yi Wang, Jing Ye, Kai Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet |
title | Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet |
title_full | Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet |
title_fullStr | Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet |
title_full_unstemmed | Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet |
title_short | Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet |
title_sort | haplotype-resolved assemblies and variant benchmark of a chinese quartet |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10694985/ http://dx.doi.org/10.1186/s13059-023-03116-3 |