Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC)

Background: Predicting the resistance profiles of pathogens with antimicrobial resistance (AMR) is increasingly important in treating infectious diseases. Various attempts have been made to build machine learning models that classify pathogens as resistant or susceptible based on either known antimi...

Bibliographic Details
Main authors: Yang, Ming-Ren; Su, Shun-Feng; Wu, Yu-Wei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10267731/
https://www.ncbi.nlm.nih.gov/pubmed/37323667
http://dx.doi.org/10.3389/fgene.2023.1054032
author Yang, Ming-Ren
Su, Shun-Feng
Wu, Yu-Wei
collection PubMed
description Background: Predicting the resistance profiles of pathogens with antimicrobial resistance (AMR) is increasingly important in treating infectious diseases. Various attempts have been made to build machine learning models that classify pathogens as resistant or susceptible based on either known antimicrobial resistance genes or the entire gene set. However, these phenotypic annotations are derived from the minimum inhibitory concentration (MIC), which is the lowest concentration of an antibiotic drug that inhibits a given pathogenic strain. Since the MIC breakpoints that classify a strain as resistant or susceptible to a specific antibiotic drug may be revised by governing institutes, we refrained from translating these MIC values into the categories “susceptible” or “resistant” and instead attempted to predict the MIC values using machine learning approaches. Results: By applying a machine learning feature selection approach to a Salmonella enterica pan-genome, in which the protein sequences were clustered to identify highly similar gene families, we showed that the selected features (genes) performed better than known AMR genes, and that models built on the selected genes achieved very accurate MIC prediction. Functional analysis revealed that about half of the selected genes were annotated as hypothetical proteins (i.e., with unknown functional roles), and that only a small portion of known AMR genes were among the selected genes, indicating that applying feature selection to the entire gene set has the potential to uncover novel genes that may be associated with and may contribute to antimicrobial resistance in pathogens. Conclusion: The pan-genome-based machine learning approach was capable of predicting MIC values with very high accuracy. The feature selection process may also identify novel AMR genes for inferring bacterial antimicrobial resistance phenotypes.
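The idea of selecting genes from a pan-genome for MIC prediction can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' pipeline: the record does not say how features were encoded or which selection method was used. The sketch assumes a binary presence/absence matrix per gene family, log2-transformed MIC values as the target, and a simple correlation filter as a stand-in feature selector. The gene family names and MIC values are hypothetical toy data.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def select_genes(presence, mic, k):
    """Rank gene families by |correlation| with log2(MIC) and keep the top k.

    presence: dict mapping gene family name -> list of 0/1 flags per strain
    mic:      list of MIC values, one per strain (same order as the flags)
    """
    log_mic = [math.log2(m) for m in mic]  # assumed two-fold dilution scale
    scores = {gene: abs(pearson(col, log_mic)) for gene, col in presence.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 6 strains, 4 gene families (1 = present, 0 = absent).
presence = {
    "family_A": [1, 1, 1, 0, 0, 0],
    "family_B": [1, 0, 1, 0, 1, 0],
    "family_C": [1, 1, 1, 1, 1, 1],  # present in every strain, so uninformative
    "family_D": [1, 1, 0, 0, 0, 0],
}
mic = [32, 16, 16, 1, 2, 1]  # hypothetical MIC values

selected = select_genes(presence, mic, k=2)
print(selected)
```

A gene family present in every strain scores zero here because it cannot distinguish strains with high MIC from strains with low MIC, which is one reason a filter over the entire gene set can rank genes differently from a fixed list of known AMR genes.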
format Online
Article
Text
id pubmed-10267731
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10267731 2023-06-15 Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC) Yang, Ming-Ren Su, Shun-Feng Wu, Yu-Wei Front Genet Genetics Frontiers Media S.A. 2023-05-30 /pmc/articles/PMC10267731/ /pubmed/37323667 http://dx.doi.org/10.3389/fgene.2023.1054032 Text en Copyright © 2023 Yang, Su and Wu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC)
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10267731/
https://www.ncbi.nlm.nih.gov/pubmed/37323667
http://dx.doi.org/10.3389/fgene.2023.1054032