
Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses

Salmonella is the most common cause of gastroenteritis in the world. Over the past 5 years, whole-genome analysis has led to the high-resolution characterization of clinical and foodborne Salmonella responsible for typhoid fever, foodborne illness or contamination of the agro-food chain. Whole-genom...


Bibliographic Details
Main Authors: Cherchame, Emeline, Ilango, Guy, Noël, Véronique, Cadel-Six, Sabrina
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Public Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9493441/
https://www.ncbi.nlm.nih.gov/pubmed/36159272
http://dx.doi.org/10.3389/fpubh.2022.963188
_version_ 1784793720080564224
author Cherchame, Emeline
Ilango, Guy
Noël, Véronique
Cadel-Six, Sabrina
author_sort Cherchame, Emeline
collection PubMed
description Salmonella is the most common cause of gastroenteritis in the world. Over the past 5 years, whole-genome analysis has led to the high-resolution characterization of clinical and foodborne Salmonella responsible for typhoid fever, foodborne illness or contamination of the agro-food chain. Whole-genome analyses are simplified by the availability of high-quality, complete genomes for mapping analysis and for calculating the pairwise distance between genomes, but unfortunately some difficulties may still remain. For some serovars, the complete genome is not available, or some serovars are polyphyletic and knowing the serovar alone is not sufficient for choosing the most appropriate reference genome. For these serovars, it is essential to identify the genetically closest complete genome to be able to carry out precise genome analyses. In this study, we explored the genomic proximity of 650 genomes of the 58 Salmonella enterica subsp. enterica serovars most frequently isolated in humans and from the food chain in the United States (US) and in Europe (EU), with a special focus on France. For each serovar, to take into account their genomic diversity, we included all the multilocus sequence type (MLST) profiles represented in EnteroBase with 10 or more genomes (on 19 July 2021). A phylogenetic analysis using both core- and pan-genome approaches was carried out to identify the genomic proximity of all the Salmonella studied and 20 polyphyletic serovars that have not yet been described in the literature. This study determined the genetic proximity between all 58 serovars studied and revealed polyphyletic serovars, their genomic lineages and MLST profiles. Finally, we enhanced the open-access databases with 73 new genomes and produced a list of high-quality complete reference genomes for 48 S. enterica subsp. enterica serovars among the most isolated in the US, EU, and France.
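The description above rests on two selection ideas: keeping only MLST profiles represented by 10 or more genomes, and choosing the genetically closest complete genome as the reference. The following is a minimal Python sketch of both, not the authors' pipeline. The function names, the isolate, reference and profile labels, and the distance values are all hypothetical, and the distance table is assumed to have been computed beforehand by whatever pairwise-distance tool is in use.

```python
from collections import Counter


def frequent_profiles(st_labels, min_genomes=10):
    """Return the MLST profiles seen in at least `min_genomes` genomes."""
    counts = Counter(st_labels)
    return sorted(st for st, n in counts.items() if n >= min_genomes)


def closest_reference(query, distances):
    """Pick the complete genome with the smallest distance to `query`.

    `distances` maps (query, reference) pairs to a precomputed pairwise
    distance. Ties are broken alphabetically by reference name.
    """
    candidates = [(d, ref) for (q, ref), d in distances.items() if q == query]
    if not candidates:
        raise ValueError(f"no distances recorded for {query!r}")
    d, ref = min(candidates)
    return ref, d


# Hypothetical toy data: one profile label per genome, invented distances.
st_labels = ["ST_A"] * 12 + ["ST_B"] * 3 + ["ST_C"] * 10
toy_distances = {
    ("isolate_1", "ref_X"): 0.012,
    ("isolate_1", "ref_Y"): 0.004,
    ("isolate_2", "ref_X"): 0.002,
}

print(frequent_profiles(st_labels))
print(closest_reference("isolate_1", toy_distances))
```

For a polyphyletic serovar, a lookup of this kind would be run per lineage or per MLST profile rather than per serovar, since the serovar name alone does not identify the nearest complete genome.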
format Online
Article
Text
id pubmed-9493441
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9493441 2022-09-23
Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses
Cherchame, Emeline
Ilango, Guy
Noël, Véronique
Cadel-Six, Sabrina
Front Public Health
Public Health
Frontiers Media S.A. 2022-09-08
/pmc/articles/PMC9493441/ /pubmed/36159272 http://dx.doi.org/10.3389/fpubh.2022.963188
Text en
Copyright © 2022 Cherchame, Ilango, Noël and Cadel-Six. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses
topic Public Health