Curated variation benchmarks for challenging medically-relevant autosomal genes

The repetitive nature and complexity of some medically relevant genes poses a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly four hundred medically relevant genes due to their repetitiveness...

Bibliographic Details
Main Authors: Wagner, Justin, Olson, Nathan D, Harris, Lindsay, McDaniel, Jennifer, Cheng, Haoyu, Fungtammasan, Arkarachai, Hwang, Yih-Chii, Gupta, Richa, Wenger, Aaron M, Rowell, William J, Khan, Ziad M, Farek, Jesse, Zhu, Yiming, Pisupati, Aishwarya, Mahmoud, Medhat, Xiao, Chunlin, Yoo, Byunggil, Sahraeian, Sayed Mohammad Ebrahim, Miller, Danny E., Jáspez, David, Lorenzo-Salazar, José M., Muñoz-Barrera, Adrián, Rubio-Rodríguez, Luis A., Flores, Carlos, Narzisi, Giuseppe, Evani, Uday Shanker, Clarke, Wayne E., Lee, Joyce, Mason, Christopher E., Lincoln, Stephen E., Miga, Karen H., Ebbert, Mark T. W., Shumate, Alaina, Li, Heng, Chin, Chen-Shan, Zook, Justin M, Sedlazeck, Fritz J
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9117392/
https://www.ncbi.nlm.nih.gov/pubmed/35132260
http://dx.doi.org/10.1038/s41587-021-01158-1
_version_ 1784710324717355008
author Wagner, Justin
Olson, Nathan D
Harris, Lindsay
McDaniel, Jennifer
Cheng, Haoyu
Fungtammasan, Arkarachai
Hwang, Yih-Chii
Gupta, Richa
Wenger, Aaron M
Rowell, William J
Khan, Ziad M
Farek, Jesse
Zhu, Yiming
Pisupati, Aishwarya
Mahmoud, Medhat
Xiao, Chunlin
Yoo, Byunggil
Sahraeian, Sayed Mohammad Ebrahim
Miller, Danny E.
Jáspez, David
Lorenzo-Salazar, José M.
Muñoz-Barrera, Adrián
Rubio-Rodríguez, Luis A.
Flores, Carlos
Narzisi, Giuseppe
Evani, Uday Shanker
Clarke, Wayne E.
Lee, Joyce
Mason, Christopher E.
Lincoln, Stephen E.
Miga, Karen H.
Ebbert, Mark T. W.
Shumate, Alaina
Li, Heng
Chin, Chen-Shan
Zook, Justin M
Sedlazeck, Fritz J
author_sort Wagner, Justin
collection PubMed
description The repetitive nature and complexity of some medically relevant genes poses a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly four hundred medically relevant genes due to their repetitiveness or polymorphic complexity. Here we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly. This curated benchmark reports over 17,000 single nucleotide variations, 3,600 INDELs, and 200 structural variations each for human genome reference GRCh37 and GRCh38 across HG002. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically relevant genes including CBS, CRYAA, and KCNE1. When masking these false duplications, variant recall can improve from 8% to 100%. Forming benchmarks from a haplotype-resolved whole-genome assembly may become a prototype for future benchmarks covering the whole genome.
format Online
Article
Text
id pubmed-9117392
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-9117392 2022-08-07 Curated variation benchmarks for challenging medically-relevant autosomal genes Nat Biotechnol Article
2022-05 2022-02-07 /pmc/articles/PMC9117392/ /pubmed/35132260 http://dx.doi.org/10.1038/s41587-021-01158-1 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms
title Curated variation benchmarks for challenging medically-relevant autosomal genes
topic Article