Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses
Salmonella is the most common cause of gastroenteritis in the world. Over the past 5 years, whole-genome analysis has led to the high-resolution characterization of clinical and foodborne Salmonella responsible for typhoid fever, foodborne illness or contamination of the agro-food chain. Whole-genom...
Main authors: | Cherchame, Emeline; Ilango, Guy; Noël, Véronique; Cadel-Six, Sabrina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Public Health |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9493441/ https://www.ncbi.nlm.nih.gov/pubmed/36159272 http://dx.doi.org/10.3389/fpubh.2022.963188 |
_version_ | 1784793720080564224 |
---|---|
author | Cherchame, Emeline Ilango, Guy Noël, Véronique Cadel-Six, Sabrina |
author_sort | Cherchame, Emeline |
collection | PubMed |
description | Salmonella is the most common cause of gastroenteritis in the world. Over the past 5 years, whole-genome analysis has led to the high-resolution characterization of clinical and foodborne Salmonella responsible for typhoid fever, foodborne illness or contamination of the agro-food chain. Whole-genome analyses are simplified by the availability of high-quality, complete genomes for mapping analysis and for calculating the pairwise distance between genomes, but unfortunately some difficulties may still remain. For some serovars, the complete genome is not available, or some serovars are polyphyletic and knowing the serovar alone is not sufficient for choosing the most appropriate reference genome. For these serovars, it is essential to identify the genetically closest complete genome to be able to carry out precise genome analyses. In this study, we explored the genomic proximity of 650 genomes of the 58 Salmonella enterica subsp. enterica serovars most frequently isolated in humans and from the food chain in the United States (US) and in Europe (EU), with a special focus on France. For each serovar, to take into account their genomic diversity, we included all the multilocus sequence type (MLST) profiles represented in EnteroBase with 10 or more genomes (on 19 July 2021). A phylogenetic analysis using both core- and pan-genome approaches was carried out to identify the genomic proximity of all the Salmonella studied and 20 polyphyletic serovars that have not yet been described in the literature. This study determined the genetic proximity between all 58 serovars studied and revealed polyphyletic serovars, their genomic lineages and MLST profiles. Finally, we enhanced the open-access databases with 73 new genomes and produced a list of high-quality complete reference genomes for 48 S. enterica subsp. enterica serovars among the most isolated in the US, EU, and France. |
format | Online Article Text |
id | pubmed-9493441 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9493441 2022-09-23 Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses Cherchame, Emeline Ilango, Guy Noël, Véronique Cadel-Six, Sabrina Front Public Health Public Health Frontiers Media S.A. 2022-09-08 /pmc/articles/PMC9493441/ /pubmed/36159272 http://dx.doi.org/10.3389/fpubh.2022.963188 Text en Copyright © 2022 Cherchame, Ilango, Noël and Cadel-Six. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Public Health Cherchame, Emeline Ilango, Guy Noël, Véronique Cadel-Six, Sabrina Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses |
title | Polyphyly in widespread Salmonella enterica serovars and using genomic proximity to choose the best reference genome for bioinformatics analyses |
topic | Public Health |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9493441/ https://www.ncbi.nlm.nih.gov/pubmed/36159272 http://dx.doi.org/10.3389/fpubh.2022.963188 |