Patchy promiscuity: machine learning applied to predict the host specificity of Salmonella enterica and Escherichia coli

Salmonella enterica and Escherichia coli are bacterial species that colonize different animal hosts with sub-types that can cause life-threatening infections in humans. Source attribution of zoonoses is an important goal for infection control as is identification of isolates in reservoir hosts that...

Bibliographic Details
Main authors: Lupolova, Nadejda, Dallman, Tim J., Holden, Nicola J., Gally, David L.
Format: Online Article Text
Language: English
Published: Microbiology Society 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695212/
https://www.ncbi.nlm.nih.gov/pubmed/29177093
http://dx.doi.org/10.1099/mgen.0.000135
author Lupolova, Nadejda
Dallman, Tim J.
Holden, Nicola J.
Gally, David L.
collection PubMed
description Salmonella enterica and Escherichia coli are bacterial species that colonize different animal hosts with sub-types that can cause life-threatening infections in humans. Source attribution of zoonoses is an important goal for infection control, as is identification of isolates in reservoir hosts that represent a threat to human health. In this study, host specificity and zoonotic potential were predicted using machine learning in which Support Vector Machine (SVM) classifiers were built based on predicted proteins from whole genome sequences. Analysis of over 1000 S. enterica genomes allowed the correct prediction (67–90 % accuracy) of the source host for S. Typhimurium isolates and the same classifier could then differentiate the source host for alternative serovars such as S. Dublin. A key finding from both phylogeny and SVM methods was that the majority of isolates were assigned to host-specific sub-clusters and had high host-specific SVM scores. Moreover, only a minor subset of isolates had high probability scores for multiple hosts, indicating generalists with genetic content that may facilitate transition between hosts. The same approach correctly identified human versus bovine E. coli isolates (83 % accuracy) and the potential of the classifier to predict a zoonotic threat was demonstrated using E. coli O157. This research indicates marked host restriction for both S. enterica and E. coli, with only limited isolate subsets exhibiting host promiscuity by gene content. Machine learning can be successfully applied to interrogate source attribution of bacterial isolates and has the capacity to predict zoonotic potential.
format Online
Article
Text
id pubmed-5695212
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-5695212 2017-11-24 Patchy promiscuity: machine learning applied to predict the host specificity of Salmonella enterica and Escherichia coli Lupolova, Nadejda Dallman, Tim J. Holden, Nicola J. Gally, David L. Microb Genom Research Article
Microbiology Society 2017-10-03 /pmc/articles/PMC5695212/ /pubmed/29177093 http://dx.doi.org/10.1099/mgen.0.000135 Text en © 2017 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.
title Patchy promiscuity: machine learning applied to predict the host specificity of Salmonella enterica and Escherichia coli
topic Research Article