_version_ 1784840534527836160
author Wagner, Justin
Olson, Nathan D.
Harris, Lindsay
Khan, Ziad
Farek, Jesse
Mahmoud, Medhat
Stankovic, Ana
Kovacevic, Vladimir
Yoo, Byunggil
Miller, Neil
Rosenfeld, Jeffrey A.
Ni, Bohan
Zarate, Samantha
Kirsche, Melanie
Aganezov, Sergey
Schatz, Michael C.
Narzisi, Giuseppe
Byrska-Bishop, Marta
Clarke, Wayne
Evani, Uday S.
Markello, Charles
Shafin, Kishwar
Zhou, Xin
Sidow, Arend
Bansal, Vikas
Ebert, Peter
Marschall, Tobias
Lansdorp, Peter
Hanlon, Vincent
Mattsson, Carl-Adam
Barrio, Alvaro Martinez
Fiddes, Ian T.
Xiao, Chunlin
Fungtammasan, Arkarachai
Chin, Chen-Shan
Wenger, Aaron M.
Rowell, William J.
Sedlazeck, Fritz J.
Carroll, Andrew
Salit, Marc
Zook, Justin M.
author_facet Wagner, Justin
Olson, Nathan D.
Harris, Lindsay
Khan, Ziad
Farek, Jesse
Mahmoud, Medhat
Stankovic, Ana
Kovacevic, Vladimir
Yoo, Byunggil
Miller, Neil
Rosenfeld, Jeffrey A.
Ni, Bohan
Zarate, Samantha
Kirsche, Melanie
Aganezov, Sergey
Schatz, Michael C.
Narzisi, Giuseppe
Byrska-Bishop, Marta
Clarke, Wayne
Evani, Uday S.
Markello, Charles
Shafin, Kishwar
Zhou, Xin
Sidow, Arend
Bansal, Vikas
Ebert, Peter
Marschall, Tobias
Lansdorp, Peter
Hanlon, Vincent
Mattsson, Carl-Adam
Barrio, Alvaro Martinez
Fiddes, Ian T.
Xiao, Chunlin
Fungtammasan, Arkarachai
Chin, Chen-Shan
Wenger, Aaron M.
Rowell, William J.
Sedlazeck, Fritz J.
Carroll, Andrew
Salit, Marc
Zook, Justin M.
author_sort Wagner, Justin
collection PubMed
description Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, we include 92% of the autosomal GRCh38 assembly while excluding regions problematic for benchmarking small variants, such as copy number variants, that should not have been in the previous version, which included 85% of GRCh38. It identifies eight times more false negatives in a short read variant call set relative to our previous benchmark. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development.
format Online
Article
Text
id pubmed-9706577
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-97065772022-11-29 Benchmarking challenging small variants with linked and long reads Wagner, Justin Olson, Nathan D. Harris, Lindsay Khan, Ziad Farek, Jesse Mahmoud, Medhat Stankovic, Ana Kovacevic, Vladimir Yoo, Byunggil Miller, Neil Rosenfeld, Jeffrey A. Ni, Bohan Zarate, Samantha Kirsche, Melanie Aganezov, Sergey Schatz, Michael C. Narzisi, Giuseppe Byrska-Bishop, Marta Clarke, Wayne Evani, Uday S. Markello, Charles Shafin, Kishwar Zhou, Xin Sidow, Arend Bansal, Vikas Ebert, Peter Marschall, Tobias Lansdorp, Peter Hanlon, Vincent Mattsson, Carl-Adam Barrio, Alvaro Martinez Fiddes, Ian T. Xiao, Chunlin Fungtammasan, Arkarachai Chin, Chen-Shan Wenger, Aaron M. Rowell, William J. Sedlazeck, Fritz J. Carroll, Andrew Salit, Marc Zook, Justin M. Cell Genom Article Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, we include 92% of the autosomal GRCh38 assembly while excluding regions problematic for benchmarking small variants, such as copy number variants, that should not have been in the previous version, which included 85% of GRCh38. It identifies eight times more false negatives in a short read variant call set relative to our previous benchmark. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development. 
Elsevier 2022-04-28 /pmc/articles/PMC9706577/ /pubmed/36452119 http://dx.doi.org/10.1016/j.xgen.2022.100128 Text en https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Wagner, Justin
Olson, Nathan D.
Harris, Lindsay
Khan, Ziad
Farek, Jesse
Mahmoud, Medhat
Stankovic, Ana
Kovacevic, Vladimir
Yoo, Byunggil
Miller, Neil
Rosenfeld, Jeffrey A.
Ni, Bohan
Zarate, Samantha
Kirsche, Melanie
Aganezov, Sergey
Schatz, Michael C.
Narzisi, Giuseppe
Byrska-Bishop, Marta
Clarke, Wayne
Evani, Uday S.
Markello, Charles
Shafin, Kishwar
Zhou, Xin
Sidow, Arend
Bansal, Vikas
Ebert, Peter
Marschall, Tobias
Lansdorp, Peter
Hanlon, Vincent
Mattsson, Carl-Adam
Barrio, Alvaro Martinez
Fiddes, Ian T.
Xiao, Chunlin
Fungtammasan, Arkarachai
Chin, Chen-Shan
Wenger, Aaron M.
Rowell, William J.
Sedlazeck, Fritz J.
Carroll, Andrew
Salit, Marc
Zook, Justin M.
Benchmarking challenging small variants with linked and long reads
title Benchmarking challenging small variants with linked and long reads
title_full Benchmarking challenging small variants with linked and long reads
title_fullStr Benchmarking challenging small variants with linked and long reads
title_full_unstemmed Benchmarking challenging small variants with linked and long reads
title_short Benchmarking challenging small variants with linked and long reads
title_sort benchmarking challenging small variants with linked and long reads
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9706577/
https://www.ncbi.nlm.nih.gov/pubmed/36452119
http://dx.doi.org/10.1016/j.xgen.2022.100128
work_keys_str_mv AT wagnerjustin benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT olsonnathand benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT harrislindsay benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT khanziad benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT farekjesse benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT mahmoudmedhat benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT stankovicana benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT kovacevicvladimir benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT yoobyunggil benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT millerneil benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT rosenfeldjeffreya benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT nibohan benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT zaratesamantha benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT kirschemelanie benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT aganezovsergey benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT schatzmichaelc benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT narzisigiuseppe benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT byrskabishopmarta benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT clarkewayne benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT evaniudays benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT markellocharles benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT shafinkishwar benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT zhouxin benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT sidowarend benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT bansalvikas benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT ebertpeter benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT marschalltobias benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT lansdorppeter benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT hanlonvincent benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT mattssoncarladam benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT barrioalvaromartinez benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT fiddesiant benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT xiaochunlin benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT fungtammasanarkarachai benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT chinchenshan benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT wengeraaronm benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT rowellwilliamj benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT sedlazeckfritzj benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT carrollandrew benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT salitmarc benchmarkingchallengingsmallvariantswithlinkedandlongreads
AT zookjustinm benchmarkingchallengingsmallvariantswithlinkedandlongreads