Large-scale comparative genomics to refine the organization of the global Salmonella enterica population structure

The White–Kauffmann–Le Minor (WKL) scheme is the most widely used Salmonella typing scheme for reporting the disease prevalence of the enteric pathogen. With the advent of whole-genome sequencing (WGS), in silico methods have increasingly replaced traditional serotyping due to reproducibility, speed...

Bibliographic Details
Main Authors: Liu, Chao Chun, Hsiao, William W. L.
Format: Online Article Text
Language: English
Published: Microbiology Society 2022
Subjects: Research Articles
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9837569/
https://www.ncbi.nlm.nih.gov/pubmed/36748524
http://dx.doi.org/10.1099/mgen.0.000906
author Liu, Chao Chun
Hsiao, William W. L.
collection PubMed
description The White–Kauffmann–Le Minor (WKL) scheme is the most widely used Salmonella typing scheme for reporting the disease prevalence of the enteric pathogen. With the advent of whole-genome sequencing (WGS), in silico methods have increasingly replaced traditional serotyping due to reproducibility, speed and coverage. However, despite integrating genomic-based typing by in silico serotyping tools such as SISTR, in silico serotyping in certain contexts remains ambiguous and insufficiently informative. Specifically, in silico serotyping does not attempt to resolve polyphyly. Furthermore, in spite of the widespread acknowledgement of polyphyly from genomic studies, the prevalence of polyphyletic serovars is not well characterized. Here, we applied a genomics approach to acquire the necessary resolution to classify genetically discordant serovars and propose an alternative typing scheme that consistently reflects natural Salmonella populations. By accessing the unprecedented volume of bacterial genomic data publicly available in GenomeTrakr and PubMLST databases (>180 000 genomes representing 723 serovars), we characterized the global Salmonella population structure and systematically identified putative non-monophyletic serovars. The proportion of putative non-monophyletic serovars was estimated to be higher than in previous reports, reinforcing the inability of antigenic determinants to depict the complexity of Salmonella evolutionary history. We explored the extent of genetic diversity masked by serotyping labels and found significant intra-serovar molecular differences across many clinically important serovars. To avoid false discovery due to incorrect in silico serotyping calls, we cross-referenced reported serovar labels and concluded that the error rate of in silico serotyping is low. The combined application of clustering statistics and genome-wide association methods demonstrated effective characterization of stable bacterial populations and explained functional differences. The collective methods adopted in our study have practical value in establishing genomic-based typing nomenclatures for an entire microbial species or closely related subpopulations. Ultimately, we foresee an improved typing scheme to be a hybrid that integrates both genomic and antigenic information such that the resolution from WGS is leveraged to improve the precision of subpopulation classification while preserving the common names defined by the WKL scheme.
format Online
Article
Text
id pubmed-9837569
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-9837569 2023-01-13 Large-scale comparative genomics to refine the organization of the global Salmonella enterica population structure Liu, Chao Chun Hsiao, William W. L. Microb Genom Research Articles Microbiology Society 2022-12-07 /pmc/articles/PMC9837569/ /pubmed/36748524 http://dx.doi.org/10.1099/mgen.0.000906 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
title Large-scale comparative genomics to refine the organization of the global Salmonella enterica population structure
topic Research Articles