Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data
We compared the performance of four open-source in silico Salmonella typing tools (SeqSero, SeqSero2, Salmonella In Silico Typing Resource [SISTR], and Metric Oriented Sequence Typer [MOST]) to assess their potential for replacing laboratory serological testing with serovar predictions from whole-ge...
Main Authors: | Uelze, Laura; Borowiak, Maria; Deneke, Carlus; Szabó, István; Fischer, Jennie; Tausch, Simon H.; Malorny, Burkhard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology 2020 |
Subjects: | Food Microbiology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7028957/ https://www.ncbi.nlm.nih.gov/pubmed/31862714 http://dx.doi.org/10.1128/AEM.02265-19 |
_version_ | 1783499075275980800 |
---|---|
author | Uelze, Laura Borowiak, Maria Deneke, Carlus Szabó, István Fischer, Jennie Tausch, Simon H. Malorny, Burkhard |
author_facet | Uelze, Laura Borowiak, Maria Deneke, Carlus Szabó, István Fischer, Jennie Tausch, Simon H. Malorny, Burkhard |
author_sort | Uelze, Laura |
collection | PubMed |
description | We compared the performance of four open-source in silico Salmonella typing tools (SeqSero, SeqSero2, Salmonella In Silico Typing Resource [SISTR], and Metric Oriented Sequence Typer [MOST]) to assess their potential for replacing laboratory serological testing with serovar predictions from whole-genome sequencing data. We conducted a retrospective analysis of 1,624 Salmonella isolates of 72 serovars submitted to the German National Salmonella Reference Laboratory between 1999 and 2019. All isolates are derived from animal and foodstuff origins. We conducted Illumina short-read sequencing and compared the in silico serovar prediction results with the results of routine laboratory serotyping. We found the best-performing in silico serovar prediction tool to be SISTR, with 94% correctly typed isolates, followed by SeqSero2 (87%), SeqSero (81%), and MOST (79%). Furthermore, we found that mapping-based tools like SeqSero and SeqSero2 (allele mode) were more reliable for the prediction of monophasic variants, while sequence type and cluster-based methods like MOST and SISTR (core-genome multilocus sequence type [cgMLST]), showed greater resilience when confronted with GC-biased sequencing data. We showed that the choice of library preparation kit could substantially affect O antigen detection, due to the low GC content of the wzx and wzy genes. Although the accuracy of computational serovar predictions is still not quite on par with traditional serotyping by Salmonella reference laboratories, the command-line tools investigated in this study perform a rapid, efficient, inexpensive, and reproducible analysis, which can be integrated into in-house characterization pipelines. Based on our results, we find SISTR most suitable for automated, routine serotyping for public health surveillance of Salmonella. IMPORTANCE Salmonella spp. are important foodborne pathogens. To reduce the number of infected patients, it is essential to understand which subtypes of the bacteria cause disease outbreaks. Traditionally, characterization of Salmonella requires serological testing, a laboratory method by which Salmonella isolates can be classified into over 2,600 distinct subtypes, called serovars. Due to recent advances in whole-genome sequencing, many tools have been developed to replace traditional testing methods with computational analysis of genome sequences. It is crucial to validate that these tools, many already in use for routine surveillance, deliver accurate and reliable serovar information. In this study, we set out to compare which of the currently available open-source command-line tools is most suitable to replace serological testing. A thorough evaluation of the differing computational approaches is highly important to ensure the backward compatibility of serotyping data and to maintain comparability between laboratories. |
format | Online Article Text |
id | pubmed-7028957 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-7028957 2020-03-06 Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data Uelze, Laura Borowiak, Maria Deneke, Carlus Szabó, István Fischer, Jennie Tausch, Simon H. Malorny, Burkhard Appl Environ Microbiol Food Microbiology We compared the performance of four open-source in silico Salmonella typing tools (SeqSero, SeqSero2, Salmonella In Silico Typing Resource [SISTR], and Metric Oriented Sequence Typer [MOST]) to assess their potential for replacing laboratory serological testing with serovar predictions from whole-genome sequencing data. We conducted a retrospective analysis of 1,624 Salmonella isolates of 72 serovars submitted to the German National Salmonella Reference Laboratory between 1999 and 2019. All isolates are derived from animal and foodstuff origins. We conducted Illumina short-read sequencing and compared the in silico serovar prediction results with the results of routine laboratory serotyping. We found the best-performing in silico serovar prediction tool to be SISTR, with 94% correctly typed isolates, followed by SeqSero2 (87%), SeqSero (81%), and MOST (79%). Furthermore, we found that mapping-based tools like SeqSero and SeqSero2 (allele mode) were more reliable for the prediction of monophasic variants, while sequence type and cluster-based methods like MOST and SISTR (core-genome multilocus sequence type [cgMLST]), showed greater resilience when confronted with GC-biased sequencing data. We showed that the choice of library preparation kit could substantially affect O antigen detection, due to the low GC content of the wzx and wzy genes. Although the accuracy of computational serovar predictions is still not quite on par with traditional serotyping by Salmonella reference laboratories, the command-line tools investigated in this study perform a rapid, efficient, inexpensive, and reproducible analysis, which can be integrated into in-house characterization pipelines. Based on our results, we find SISTR most suitable for automated, routine serotyping for public health surveillance of Salmonella. IMPORTANCE Salmonella spp. are important foodborne pathogens. To reduce the number of infected patients, it is essential to understand which subtypes of the bacteria cause disease outbreaks. Traditionally, characterization of Salmonella requires serological testing, a laboratory method by which Salmonella isolates can be classified into over 2,600 distinct subtypes, called serovars. Due to recent advances in whole-genome sequencing, many tools have been developed to replace traditional testing methods with computational analysis of genome sequences. It is crucial to validate that these tools, many already in use for routine surveillance, deliver accurate and reliable serovar information. In this study, we set out to compare which of the currently available open-source command-line tools is most suitable to replace serological testing. A thorough evaluation of the differing computational approaches is highly important to ensure the backward compatibility of serotyping data and to maintain comparability between laboratories. American Society for Microbiology 2020-02-18 /pmc/articles/PMC7028957/ /pubmed/31862714 http://dx.doi.org/10.1128/AEM.02265-19 Text en Copyright © 2020 Uelze et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Food Microbiology Uelze, Laura Borowiak, Maria Deneke, Carlus Szabó, István Fischer, Jennie Tausch, Simon H. Malorny, Burkhard Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data |
title | Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data |
title_full | Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data |
title_fullStr | Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data |
title_full_unstemmed | Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data |
title_short | Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data |
title_sort | performance and accuracy of four open-source tools for in silico serotyping of salmonella spp. based on whole-genome short-read sequencing data |
topic | Food Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7028957/ https://www.ncbi.nlm.nih.gov/pubmed/31862714 http://dx.doi.org/10.1128/AEM.02265-19 |
work_keys_str_mv | AT uelzelaura performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata AT borowiakmaria performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata AT denekecarlus performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata AT szaboistvan performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata AT fischerjennie performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata AT tauschsimonh performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata AT malornyburkhard performanceandaccuracyoffouropensourcetoolsforinsilicoserotypingofsalmonellasppbasedonwholegenomeshortreadsequencingdata |
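
The abstract reports each tool's share of correctly typed isolates, obtained by comparing in silico predictions against routine laboratory serotyping. A minimal sketch of how such a concordance figure could be computed is given below. It is not the authors' pipeline: the function name `concordance`, the tool labels `tool_a` and `tool_b`, the isolate identifiers, and the serovar assignments are all hypothetical, and the comparison assumes a simple case-insensitive string match on serovar names.

```python
def concordance(reference, predictions):
    """Fraction of reference isolates whose predicted serovar matches.

    reference   -- dict mapping isolate ID to the laboratory serovar
    predictions -- dict mapping isolate ID to the in silico serovar
    An isolate with no prediction counts as incorrectly typed.
    """
    if not reference:
        raise ValueError("no reference serotyping results given")
    correct = sum(
        1
        for isolate, serovar in reference.items()
        if predictions.get(isolate, "").strip().lower() == serovar.strip().lower()
    )
    return correct / len(reference)


# Hypothetical laboratory results for four isolates (illustration only).
reference = {
    "iso1": "Typhimurium",
    "iso2": "Enteritidis",
    "iso3": "Infantis",
    "iso4": "Derby",
}

# Hypothetical predictions from two unnamed tools (illustration only).
tool_calls = {
    "tool_a": {
        "iso1": "Typhimurium",
        "iso2": "Enteritidis",
        "iso3": "Infantis",
        "iso4": "Agona",  # mismatch
    },
    "tool_b": {
        "iso1": "typhimurium",  # case differs, still counted as a match
        "iso2": "Enteritidis",
        # iso3 and iso4 have no prediction and count as incorrect
    },
}

for tool, calls in tool_calls.items():
    print(f"{tool}: {concordance(reference, calls):.0%} correctly typed")
```

A real evaluation would likely need more than string matching, for example rules for monophasic variants or ambiguous calls, which this sketch deliberately leaves out.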