Curated variation benchmarks for challenging medically-relevant autosomal genes
The repetitive nature and complexity of some medically relevant genes poses a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly four hundred medically relevant genes due to their repetitiveness...
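Main authors: | Wagner, Justin; Olson, Nathan D; Harris, Lindsay; McDaniel, Jennifer; Cheng, Haoyu; Fungtammasan, Arkarachai; Hwang, Yih-Chii; Gupta, Richa; Wenger, Aaron M; Rowell, William J; Khan, Ziad M; Farek, Jesse; Zhu, Yiming; Pisupati, Aishwarya; Mahmoud, Medhat; Xiao, Chunlin; Yoo, Byunggil; Sahraeian, Sayed Mohammad Ebrahim; Miller, Danny E.; Jáspez, David; Lorenzo-Salazar, José M.; Muñoz-Barrera, Adrián; Rubio-Rodríguez, Luis A.; Flores, Carlos; Narzisi, Giuseppe; Evani, Uday Shanker; Clarke, Wayne E.; Lee, Joyce; Mason, Christopher E.; Lincoln, Stephen E.; Miga, Karen H.; Ebbert, Mark T. W.; Shumate, Alaina; Li, Heng; Chin, Chen-Shan; Zook, Justin M; Sedlazeck, Fritz J |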
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9117392/ https://www.ncbi.nlm.nih.gov/pubmed/35132260 http://dx.doi.org/10.1038/s41587-021-01158-1 |
_version_ | 1784710324717355008 |
---|---|
author | Wagner, Justin Olson, Nathan D Harris, Lindsay McDaniel, Jennifer Cheng, Haoyu Fungtammasan, Arkarachai Hwang, Yih-Chii Gupta, Richa Wenger, Aaron M Rowell, William J Khan, Ziad M Farek, Jesse Zhu, Yiming Pisupati, Aishwarya Mahmoud, Medhat Xiao, Chunlin Yoo, Byunggil Sahraeian, Sayed Mohammad Ebrahim Miller, Danny E. Jáspez, David Lorenzo-Salazar, José M. Muñoz-Barrera, Adrián Rubio-Rodríguez, Luis A. Flores, Carlos Narzisi, Giuseppe Evani, Uday Shanker Clarke, Wayne E. Lee, Joyce Mason, Christopher E. Lincoln, Stephen E. Miga, Karen H. Ebbert, Mark T. W. Shumate, Alaina Li, Heng Chin, Chen-Shan Zook, Justin M Sedlazeck, Fritz J |
author_facet | Wagner, Justin Olson, Nathan D Harris, Lindsay McDaniel, Jennifer Cheng, Haoyu Fungtammasan, Arkarachai Hwang, Yih-Chii Gupta, Richa Wenger, Aaron M Rowell, William J Khan, Ziad M Farek, Jesse Zhu, Yiming Pisupati, Aishwarya Mahmoud, Medhat Xiao, Chunlin Yoo, Byunggil Sahraeian, Sayed Mohammad Ebrahim Miller, Danny E. Jáspez, David Lorenzo-Salazar, José M. Muñoz-Barrera, Adrián Rubio-Rodríguez, Luis A. Flores, Carlos Narzisi, Giuseppe Evani, Uday Shanker Clarke, Wayne E. Lee, Joyce Mason, Christopher E. Lincoln, Stephen E. Miga, Karen H. Ebbert, Mark T. W. Shumate, Alaina Li, Heng Chin, Chen-Shan Zook, Justin M Sedlazeck, Fritz J |
author_sort | Wagner, Justin |
collection | PubMed |
description | The repetitive nature and complexity of some medically relevant genes poses a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly four hundred medically relevant genes due to their repetitiveness or polymorphic complexity. Here we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly. This curated benchmark reports over 17,000 single nucleotide variations, 3,600 INDELs, and 200 structural variations each for human genome reference GRCh37 and GRCh38 across HG002. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically relevant genes including CBS, CRYAA, and KCNE1. When masking these false duplications, variant recall can improve from 8% to 100%. Forming benchmarks from a haplotype-resolved whole-genome assembly may become a prototype for future benchmarks covering the whole genome. |
format | Online Article Text |
id | pubmed-9117392 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
record_format | MEDLINE/PubMed |
spelling | pubmed-9117392 2022-08-07 Curated variation benchmarks for challenging medically-relevant autosomal genes Wagner, Justin Olson, Nathan D Harris, Lindsay McDaniel, Jennifer Cheng, Haoyu Fungtammasan, Arkarachai Hwang, Yih-Chii Gupta, Richa Wenger, Aaron M Rowell, William J Khan, Ziad M Farek, Jesse Zhu, Yiming Pisupati, Aishwarya Mahmoud, Medhat Xiao, Chunlin Yoo, Byunggil Sahraeian, Sayed Mohammad Ebrahim Miller, Danny E. Jáspez, David Lorenzo-Salazar, José M. Muñoz-Barrera, Adrián Rubio-Rodríguez, Luis A. Flores, Carlos Narzisi, Giuseppe Evani, Uday Shanker Clarke, Wayne E. Lee, Joyce Mason, Christopher E. Lincoln, Stephen E. Miga, Karen H. Ebbert, Mark T. W. Shumate, Alaina Li, Heng Chin, Chen-Shan Zook, Justin M Sedlazeck, Fritz J Nat Biotechnol Article The repetitive nature and complexity of some medically relevant genes poses a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly four hundred medically relevant genes due to their repetitiveness or polymorphic complexity. Here we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly. This curated benchmark reports over 17,000 single nucleotide variations, 3,600 INDELs, and 200 structural variations each for human genome reference GRCh37 and GRCh38 across HG002. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically relevant genes including CBS, CRYAA, and KCNE1. When masking these false duplications, variant recall can improve from 8% to 100%. Forming benchmarks from a haplotype-resolved whole-genome assembly may become a prototype for future benchmarks covering the whole genome. 2022-05 2022-02-07 /pmc/articles/PMC9117392/ /pubmed/35132260 http://dx.doi.org/10.1038/s41587-021-01158-1 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms |
spellingShingle | Article Wagner, Justin Olson, Nathan D Harris, Lindsay McDaniel, Jennifer Cheng, Haoyu Fungtammasan, Arkarachai Hwang, Yih-Chii Gupta, Richa Wenger, Aaron M Rowell, William J Khan, Ziad M Farek, Jesse Zhu, Yiming Pisupati, Aishwarya Mahmoud, Medhat Xiao, Chunlin Yoo, Byunggil Sahraeian, Sayed Mohammad Ebrahim Miller, Danny E. Jáspez, David Lorenzo-Salazar, José M. Muñoz-Barrera, Adrián Rubio-Rodríguez, Luis A. Flores, Carlos Narzisi, Giuseppe Evani, Uday Shanker Clarke, Wayne E. Lee, Joyce Mason, Christopher E. Lincoln, Stephen E. Miga, Karen H. Ebbert, Mark T. W. Shumate, Alaina Li, Heng Chin, Chen-Shan Zook, Justin M Sedlazeck, Fritz J Curated variation benchmarks for challenging medically-relevant autosomal genes |
title | Curated variation benchmarks for challenging medically-relevant autosomal genes |
title_full | Curated variation benchmarks for challenging medically-relevant autosomal genes |
title_fullStr | Curated variation benchmarks for challenging medically-relevant autosomal genes |
title_full_unstemmed | Curated variation benchmarks for challenging medically-relevant autosomal genes |
title_short | Curated variation benchmarks for challenging medically-relevant autosomal genes |
title_sort | curated variation benchmarks for challenging medically-relevant autosomal genes |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9117392/ https://www.ncbi.nlm.nih.gov/pubmed/35132260 http://dx.doi.org/10.1038/s41587-021-01158-1 |