Predicting antimicrobial resistance using conserved genes

A growing number of studies are using machine learning models to accurately predict antimicrobial resistance (AMR) phenotypes from bacterial sequence data. Although these studies are showing promise, the models are typically trained using features derived from comprehensive sets of AMR genes or whol...

Bibliographic Details
Main Authors:	Nguyen, Marcus; Olson, Robert; Shukla, Maulik; VanOeffelen, Margo; Davis, James J.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2020
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7595632/ https://www.ncbi.nlm.nih.gov/pubmed/33075053 http://dx.doi.org/10.1371/journal.pcbi.1008319

author	Nguyen, Marcus Olson, Robert Shukla, Maulik VanOeffelen, Margo Davis, James J.
collection	PubMed
description	A growing number of studies are using machine learning models to accurately predict antimicrobial resistance (AMR) phenotypes from bacterial sequence data. Although these studies are showing promise, the models are typically trained using features derived from comprehensive sets of AMR genes or whole genome sequences and may not be suitable for use when genomes are incomplete. In this study, we explore the possibility of predicting AMR phenotypes using incomplete genome sequence data. Models were built from small sets of randomly-selected core genes after removing the AMR genes. For Klebsiella pneumoniae, Mycobacterium tuberculosis, Salmonella enterica, and Staphylococcus aureus, we report that it is possible to classify susceptible and resistant phenotypes with average F1 scores ranging from 0.80–0.89 with as few as 100 conserved non-AMR genes, with very major error rates ranging from 0.11–0.23 and major error rates ranging from 0.10–0.20. Models built from core genes have predictive power in cases where the primary AMR mechanisms result from SNPs or horizontal gene transfer. By randomly sampling non-overlapping sets of core genes, we show that F1 scores and error rates are stable and have little variance between replicates. Although these small core gene models have lower accuracies and higher error rates than models built from the corresponding assembled genomes, the results suggest that sufficient variation exists in the core non-AMR genes of a species for predicting AMR phenotypes.
format	Online Article Text
id	pubmed-7595632
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-7595632 2020-11-03 Predicting antimicrobial resistance using conserved genes Nguyen, Marcus Olson, Robert Shukla, Maulik VanOeffelen, Margo Davis, James J. PLoS Comput Biol Research Article
Public Library of Science 2020-10-19 /pmc/articles/PMC7595632/ /pubmed/33075053 http://dx.doi.org/10.1371/journal.pcbi.1008319 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title	Predicting antimicrobial resistance using conserved genes
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7595632/ https://www.ncbi.nlm.nih.gov/pubmed/33075053 http://dx.doi.org/10.1371/journal.pcbi.1008319
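
The description in this record mentions two ideas that a short sketch may make more concrete: drawing non-overlapping random sets of core genes, and scoring a classifier with F1, very major error (VME) rate, and major error (ME) rate. The Python below is an illustration only, not the authors' pipeline. The function names, the "R"/"S" label encoding, and the choice of the resistant class as the positive class for F1 are all assumptions made for this example. The error definitions follow the usual convention: a very major error is a resistant isolate predicted susceptible, and a major error is a susceptible isolate predicted resistant.

```python
import random


def disjoint_gene_sets(core_genes, set_size, n_sets, seed=0):
    """Draw n_sets non-overlapping random subsets of set_size genes each.

    Hypothetical helper: shuffles the gene list once and slices it, so no
    gene can appear in more than one subset.
    """
    if set_size * n_sets > len(core_genes):
        raise ValueError("not enough core genes for disjoint sets")
    rng = random.Random(seed)
    shuffled = list(core_genes)
    rng.shuffle(shuffled)
    return [shuffled[i * set_size:(i + 1) * set_size] for i in range(n_sets)]


def error_rates(y_true, y_pred):
    """Compute F1, VME rate, and ME rate from "R"/"S" labels.

    Assumption: "R" (resistant) is treated as the positive class for F1.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == "R" and p == "R" for t, p in pairs)
    fn = sum(t == "R" and p == "S" for t, p in pairs)  # very major errors
    fp = sum(t == "S" and p == "R" for t, p in pairs)  # major errors
    tn = sum(t == "S" and p == "S" for t, p in pairs)

    n_resistant = tp + fn
    n_susceptible = fp + tn
    f1_denominator = 2 * tp + fp + fn
    return {
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
        # VME rate: fraction of truly resistant isolates called susceptible
        "vme_rate": fn / n_resistant if n_resistant else 0.0,
        # ME rate: fraction of truly susceptible isolates called resistant
        "me_rate": fp / n_susceptible if n_susceptible else 0.0,
    }


if __name__ == "__main__":
    genes = [f"gene_{i}" for i in range(1000)]  # placeholder gene IDs
    replicates = disjoint_gene_sets(genes, set_size=100, n_sets=5)
    print(len(replicates), len(replicates[0]))

    truth = ["R", "R", "R", "R", "S", "S", "S", "S", "S", "S"]
    preds = ["R", "R", "R", "S", "S", "S", "S", "S", "R", "R"]
    print(error_rates(truth, preds))
```

Slicing a single shuffled list is one simple way to guarantee that replicates share no genes. Each replicate could then be used to train a separate model, and the spread of the resulting scores compared across replicates.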