
A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains

MOTIVATION: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for di...


Bibliographic Details
Main Authors: Her, Hsuan-Lin, Wu, Yu-Wei
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022653/
https://www.ncbi.nlm.nih.gov/pubmed/29949970
http://dx.doi.org/10.1093/bioinformatics/bty276
author Her, Hsuan-Lin
Wu, Yu-Wei
collection PubMed
description MOTIVATION: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for different microbial strains in order to identify crucial characteristics that allow certain strains to become resistant to antibiotics; however, a global inspection of the gene content responsible for AMR activities remains to be done. RESULTS: We propose a pan-genome-based approach to characterize antibiotic-resistant microbial strains and test this approach on the bacterial model organism Escherichia coli. By identifying core and accessory gene clusters and predicting AMR genes for the E. coli pan-genome, we not only showed that certain classes of genes are unevenly distributed between the core and accessory parts of the pan-genome but also demonstrated that only a portion of the identified AMR genes belong to the accessory genome. Application of machine learning algorithms to predict whether specific strains were resistant to antibiotic drugs yielded the best prediction accuracy for the set of AMR genes within the accessory part of the pan-genome, suggesting that these gene clusters were most crucial to AMR activities in E. coli. Selecting subsets of AMR genes for different antibiotic drugs based on a genetic algorithm (GA) achieved better prediction performances than the gene sets established in the literature, hinting that the gene sets selected by the GA may warrant further analysis in investigating more details about how E. coli fight against antibiotics. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6022653
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6022653 2018-07-10 A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains Her, Hsuan-Lin Wu, Yu-Wei Bioinformatics ISMB 2018–Intelligent Systems for Molecular Biology Proceedings Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022653/ /pubmed/29949970 http://dx.doi.org/10.1093/bioinformatics/bty276 Text en © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains
topic ISMB 2018–Intelligent Systems for Molecular Biology Proceedings