
In Silico Evaluation of Variant Calling Methods for Bacterial Whole-Genome Sequencing Assays

Identification and analysis of clinically relevant strains of bacteria increasingly relies on whole-genome sequencing. The downstream bioinformatics steps necessary for calling variants from short-read sequences are well-established but seldom validated against haploid genomes. We devised an in sili...


Bibliographic Details
Main Authors: Seah, Yee Mey; Stewart, Mary K.; Hoogestraat, Daniel; Ryder, Molly; Cookson, Brad T.; Salipante, Stephen J.; Hoffman, Noah G.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2023
Subjects: Bacteriology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10446864/
https://www.ncbi.nlm.nih.gov/pubmed/37428072
http://dx.doi.org/10.1128/jcm.01842-22
author Seah, Yee Mey
Stewart, Mary K.
Hoogestraat, Daniel
Ryder, Molly
Cookson, Brad T.
Salipante, Stephen J.
Hoffman, Noah G.
collection PubMed
description Identification and analysis of clinically relevant strains of bacteria increasingly relies on whole-genome sequencing. The downstream bioinformatics steps necessary for calling variants from short-read sequences are well-established but seldom validated against haploid genomes. We devised an in silico workflow to introduce single nucleotide polymorphisms (SNP) and indels into bacterial reference genomes, and computationally generate sequencing reads based on the mutated genomes. We then applied the method to Mycobacterium tuberculosis H37Rv, Staphylococcus aureus NCTC 8325, and Klebsiella pneumoniae HS11286, and used the synthetic reads as truth sets for evaluating several popular variant callers. Insertions proved especially challenging for most variant callers to correctly identify, relative to deletions and single nucleotide polymorphisms. With adequate read depth, however, variant callers that use high-quality soft-clipped reads and base mismatches to perform local realignment consistently had the highest precision and recall in identifying insertions and deletions ranging from 1 to 50 bp. The remaining variant callers had lower recall values associated with identification of insertions greater than 20 bp.
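The workflow summarized in the description (introduce known SNPs and indels into a reference, then score a caller's output against that truth set) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `mutate` and `precision_recall`, the slot-based placement of variants, and the exact-match comparison are all assumptions made for illustration, and the read-simulation step is omitted. Exact matching also assumes that called indels are normalized to the same representation as the truth set.

```python
import random
from typing import List, Set, Tuple

# (0-based position, reference allele, alternate allele), VCF-style anchoring
Variant = Tuple[int, str, str]


def mutate(reference: str, n_snps: int, n_ins: int, n_del: int,
           max_indel: int = 50, seed: int = 0) -> Tuple[str, List[Variant]]:
    """Return a mutated copy of `reference` plus the list of introduced variants.

    Hypothetical helper: variants are placed in disjoint slots of width
    max_indel + 1 so that no two variants can overlap.
    """
    rng = random.Random(seed)
    width = max_indel + 1
    kinds = ["snp"] * n_snps + ["ins"] * n_ins + ["del"] * n_del
    # Raises ValueError if the genome is too short for the requested variants.
    slots = rng.sample(range(len(reference) // width), len(kinds))

    truth: List[Variant] = []
    for slot, kind in zip(slots, kinds):
        pos = slot * width
        base = reference[pos]
        if kind == "snp":
            alt = rng.choice([b for b in "ACGT" if b != base])
            truth.append((pos, base, alt))
        elif kind == "ins":
            length = rng.randint(1, max_indel)
            inserted = "".join(rng.choice("ACGT") for _ in range(length))
            truth.append((pos, base, base + inserted))
        else:  # deletion: remove `length` bases after the anchor base
            length = rng.randint(1, max_indel)
            truth.append((pos, reference[pos:pos + length + 1], base))

    # Apply the variants left to right to build the mutated genome.
    truth.sort()
    pieces: List[str] = []
    cursor = 0
    for pos, ref, alt in truth:
        pieces.append(reference[cursor:pos])
        pieces.append(alt)
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces), truth


def precision_recall(truth: Set[Variant], called: Set[Variant]) -> Tuple[float, float]:
    """Score a call set against the truth set by exact variant match."""
    tp = len(truth & called)
    fp = len(called - truth)
    fn = len(truth - called)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    rng = random.Random(1)
    genome = "".join(rng.choice("ACGT") for _ in range(5000))  # toy reference
    mutated, truth = mutate(genome, n_snps=5, n_ins=3, n_del=3)
    # Pretend a caller missed one variant; the rest are called correctly.
    print(precision_recall(set(truth), set(truth[1:])))
```

In a real evaluation the mutated genome would be passed to a read simulator and the resulting reads to each variant caller; only the scoring step would resemble the comparison shown here.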
format Online
Article
Text
id pubmed-10446864
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-10446864 2023-08-24 In Silico Evaluation of Variant Calling Methods for Bacterial Whole-Genome Sequencing Assays Seah, Yee Mey Stewart, Mary K. Hoogestraat, Daniel Ryder, Molly Cookson, Brad T. Salipante, Stephen J. Hoffman, Noah G. J Clin Microbiol Bacteriology Identification and analysis of clinically relevant strains of bacteria increasingly relies on whole-genome sequencing. The downstream bioinformatics steps necessary for calling variants from short-read sequences are well-established but seldom validated against haploid genomes. We devised an in silico workflow to introduce single nucleotide polymorphisms (SNP) and indels into bacterial reference genomes, and computationally generate sequencing reads based on the mutated genomes. We then applied the method to Mycobacterium tuberculosis H37Rv, Staphylococcus aureus NCTC 8325, and Klebsiella pneumoniae HS11286, and used the synthetic reads as truth sets for evaluating several popular variant callers. Insertions proved especially challenging for most variant callers to correctly identify, relative to deletions and single nucleotide polymorphisms. With adequate read depth, however, variant callers that use high-quality soft-clipped reads and base mismatches to perform local realignment consistently had the highest precision and recall in identifying insertions and deletions ranging from 1 to 50 bp. The remaining variant callers had lower recall values associated with identification of insertions greater than 20 bp. American Society for Microbiology 2023-07-10 /pmc/articles/PMC10446864/ /pubmed/37428072 http://dx.doi.org/10.1128/jcm.01842-22 Text en Copyright © 2023 Seah et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title In Silico Evaluation of Variant Calling Methods for Bacterial Whole-Genome Sequencing Assays
topic Bacteriology