Towards increased accuracy and reproducibility in SARS-CoV-2 next generation sequence analysis for public health surveillance

During the COVID-19 pandemic, SARS-CoV-2 surveillance efforts integrated genome sequencing of clinical samples to identify emergent viral variants and to support rapid experimental examination of genome-informed vaccine and therapeutic designs. Given the broad range of methods applied to generate ne...

Bibliographic Details
Main authors: Connor, Ryan; Yarmosh, David A.; Maier, Wolfgang; Shakya, Migun; Martin, Ross; Bradford, Rebecca; Brister, J. Rodney; Chain, Patrick SG; Copeland, Courtney A.; di Iulio, Julia; Hu, Bin; Ebert, Philip; Gunti, Jonathan; Jin, Yumi; Katz, Kenneth S.; Kochergin, Andrey; LaRosa, Tré; Li, Jiani; Li, Po-E; Lo, Chien-Chi; Rashid, Sujatha; Maiorova, Evguenia S.; Xiao, Chunlin; Zalunin, Vadim; Pruitt, Kim D.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9645426/
https://www.ncbi.nlm.nih.gov/pubmed/36380755
http://dx.doi.org/10.1101/2022.11.03.515010
author Connor, Ryan
Yarmosh, David A.
Maier, Wolfgang
Shakya, Migun
Martin, Ross
Bradford, Rebecca
Brister, J. Rodney
Chain, Patrick SG
Copeland, Courtney A.
di Iulio, Julia
Hu, Bin
Ebert, Philip
Gunti, Jonathan
Jin, Yumi
Katz, Kenneth S.
Kochergin, Andrey
LaRosa, Tré
Li, Jiani
Li, Po-E
Lo, Chien-Chi
Rashid, Sujatha
Maiorova, Evguenia S.
Xiao, Chunlin
Zalunin, Vadim
Pruitt, Kim D.
collection PubMed
description During the COVID-19 pandemic, SARS-CoV-2 surveillance efforts integrated genome sequencing of clinical samples to identify emergent viral variants and to support rapid experimental examination of genome-informed vaccine and therapeutic designs. Given the broad range of methods applied to generate new viral genomes, it is critical that consensus and variant calling tools yield consistent results across disparate pipelines. Here we examine the impact of sequencing technologies (Illumina and Oxford Nanopore) and 7 different downstream bioinformatic protocols on SARS-CoV-2 variant calling as part of the NIH Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) Tracking Resistance and Coronavirus Evolution (TRACE) initiative, a public-private partnership established to address the COVID-19 outbreak. Our results indicate that bioinformatic workflows can yield consensus genomes with different single nucleotide polymorphisms, insertions, and/or deletions even when using the same raw sequence input datasets. We introduce the use of a specific suite of parameters and protocols that greatly improves the agreement among pipelines developed by diverse organizations. Such consistency among bioinformatic pipelines is fundamental to SARS-CoV-2 and future pathogen surveillance efforts. The application of analysis standards is necessary to more accurately document phylogenomic trends and support data-driven public health responses.
format Online
Article
Text
id pubmed-9645426
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-9645426 2022-11-15 bioRxiv Article Cold Spring Harbor Laboratory 2022-11-03 /pmc/articles/PMC9645426/ /pubmed/36380755 http://dx.doi.org/10.1101/2022.11.03.515010 Text en https://creativecommons.org/publicdomain/zero/1.0/ This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license (https://creativecommons.org/publicdomain/zero/1.0/).
title Towards increased accuracy and reproducibility in SARS-CoV-2 next generation sequence analysis for public health surveillance
topic Article