Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens

The evolution of antimicrobial resistance (AMR) poses a persistent threat to global public health. Sequencing efforts have already yielded genome sequences for thousands of resistant microbial isolates and require robust computational tools to systematically elucidate the genetic basis for AMR. Here...

Bibliographic Details
Main authors: Hyun, Jason C., Kavvas, Erol S., Monk, Jonathan M., Palsson, Bernhard O.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7067475/
https://www.ncbi.nlm.nih.gov/pubmed/32119670
http://dx.doi.org/10.1371/journal.pcbi.1007608
_version_ 1783505410840330240
author Hyun, Jason C.
Kavvas, Erol S.
Monk, Jonathan M.
Palsson, Bernhard O.
collection PubMed
description The evolution of antimicrobial resistance (AMR) poses a persistent threat to global public health. Sequencing efforts have already yielded genome sequences for thousands of resistant microbial isolates and require robust computational tools to systematically elucidate the genetic basis for AMR. Here, we present a generalizable machine learning workflow for identifying genetic features driving AMR based on constructing reference strain-agnostic pan-genomes and training random subspace ensembles (RSEs). This workflow was applied to the resistance profiles of 14 antimicrobials across three urgent threat pathogens encompassing 288 Staphylococcus aureus, 456 Pseudomonas aeruginosa, and 1588 Escherichia coli genomes. We find that feature selection by RSE detects known AMR associations more reliably than common statistical tests and previous ensemble approaches, identifying a total of 45 known AMR-conferring genes and alleles across the three organisms, as well as 25 candidate associations backed by domain-level annotations. Furthermore, we find that results from the RSE approach are consistent with existing understanding of fluoroquinolone (FQ) resistance due to mutations in the main drug targets, gyrA and parC, in all three organisms, and suggest the mutational landscape of those genes with respect to FQ resistance is simple. As larger datasets become available, we expect this approach to more reliably predict AMR determinants for a wider range of microbial pathogens.
format Online
Article
Text
id pubmed-7067475
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7067475 2020-03-23 Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens Hyun, Jason C. Kavvas, Erol S. Monk, Jonathan M. Palsson, Bernhard O. PLoS Comput Biol Research Article
Public Library of Science 2020-03-02 /pmc/articles/PMC7067475/ /pubmed/32119670 http://dx.doi.org/10.1371/journal.pcbi.1007608 Text en © 2020 Hyun et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7067475/
https://www.ncbi.nlm.nih.gov/pubmed/32119670
http://dx.doi.org/10.1371/journal.pcbi.1007608