Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning
MOTIVATION: Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus very urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low throughput and viable...
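Main authors: | Ren, Yunxiao; Chakraborty, Trinad; Doijad, Swapnil; Falgenhauer, Linda; Falgenhauer, Jane; Goesmann, Alexander; Hauschild, Anne-Christin; Schwengers, Oliver; Heider, Dominik |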
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8722762/ https://www.ncbi.nlm.nih.gov/pubmed/34613360 http://dx.doi.org/10.1093/bioinformatics/btab681 |
_version_ | 1784625582695251968 |
---|---|
author | Ren, Yunxiao Chakraborty, Trinad Doijad, Swapnil Falgenhauer, Linda Falgenhauer, Jane Goesmann, Alexander Hauschild, Anne-Christin Schwengers, Oliver Heider, Dominik |
author_sort | Ren, Yunxiao |
collection | PubMed |
description | MOTIVATION: Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus very urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low throughput and viable only for cultivable bacteria. Machine learning methods may pave the way for automated AMR prediction based on genomic data of the bacteria. However, comparing different machine learning methods for the prediction of AMR based on different encodings and whole-genome sequencing data without previously known knowledge remains to be done. RESULTS: In this study, we evaluated logistic regression (LR), support vector machine (SVM), random forest (RF) and convolutional neural network (CNN) for the prediction of AMR for the antibiotics ciprofloxacin, cefotaxime, ceftazidime and gentamicin. We could demonstrate that these models can effectively predict AMR with label encoding, one-hot encoding and frequency matrix chaos game representation (FCGR encoding) on whole-genome sequencing data. We trained these models on a large AMR dataset and evaluated them on an independent public dataset. Generally, RFs and CNNs perform better than LR and SVM with AUCs up to 0.96. Furthermore, we were able to identify mutations that are associated with AMR for each antibiotic. AVAILABILITY AND IMPLEMENTATION: Source code in data preparation and model training are provided at GitHub website (https://github.com/YunxiaoRen/ML-iAMR). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8722762 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8722762 2022-01-05 Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning Ren, Yunxiao Chakraborty, Trinad Doijad, Swapnil Falgenhauer, Linda Falgenhauer, Jane Goesmann, Alexander Hauschild, Anne-Christin Schwengers, Oliver Heider, Dominik Bioinformatics Original Paper MOTIVATION: Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus very urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low throughput and viable only for cultivable bacteria. Machine learning methods may pave the way for automated AMR prediction based on genomic data of the bacteria. However, comparing different machine learning methods for the prediction of AMR based on different encodings and whole-genome sequencing data without previously known knowledge remains to be done. RESULTS: In this study, we evaluated logistic regression (LR), support vector machine (SVM), random forest (RF) and convolutional neural network (CNN) for the prediction of AMR for the antibiotics ciprofloxacin, cefotaxime, ceftazidime and gentamicin. We could demonstrate that these models can effectively predict AMR with label encoding, one-hot encoding and frequency matrix chaos game representation (FCGR encoding) on whole-genome sequencing data. We trained these models on a large AMR dataset and evaluated them on an independent public dataset. Generally, RFs and CNNs perform better than LR and SVM with AUCs up to 0.96. Furthermore, we were able to identify mutations that are associated with AMR for each antibiotic. AVAILABILITY AND IMPLEMENTATION: Source code in data preparation and model training are provided at GitHub website (https://github.com/YunxiaoRen/ML-iAMR). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. 
Oxford University Press 2021-10-06 /pmc/articles/PMC8722762/ /pubmed/34613360 http://dx.doi.org/10.1093/bioinformatics/btab681 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8722762/ https://www.ncbi.nlm.nih.gov/pubmed/34613360 http://dx.doi.org/10.1093/bioinformatics/btab681 |
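
The abstract in this record names three sequence encodings (label encoding, one-hot encoding and FCGR) without describing them. The following is a minimal illustrative sketch of what such encodings can look like for a short nucleotide string. It is not the authors' implementation (their code is at the GitHub URL given in the description). The function names, the integer mapping, the use of 0 for unknown bases and the FCGR corner layout are all assumptions chosen for illustration; conventions differ between tools.

```python
# Illustrative sketch only; names and mappings below are assumptions,
# not taken from the paper or its repository.

# Assumed integer codes; 0 is reserved for N or any other unknown base.
LABELS = {"A": 1, "C": 2, "G": 3, "T": 4}

# Assumed chaos-game corner bits (x, y) for each nucleotide.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def label_encode(seq):
    """Map each base to a single integer code."""
    return [LABELS.get(base, 0) for base in seq.upper()]


def one_hot_encode(seq):
    """Map each base to a 4-element indicator vector in A, C, G, T order.

    Unknown bases become an all-zero vector.
    """
    order = "ACGT"
    return [[1 if base == ref else 0 for ref in order] for base in seq.upper()]


def fcgr(seq, k):
    """Count k-mers on a 2^k x 2^k grid (frequency chaos game representation).

    Each base of a k-mer contributes one bit to the column index (x) and
    one bit to the row index (y). k-mers containing unknown bases are skipped.
    """
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if any(base not in CORNERS for base in kmer):
            continue
        x = y = 0
        for base in kmer:
            bit_x, bit_y = CORNERS[base]
            x = (x << 1) | bit_x
            y = (y << 1) | bit_y
        grid[y][x] += 1
    return grid


if __name__ == "__main__":
    example = "ACGTN"
    print(label_encode(example))
    print(one_hot_encode(example))
    for row in fcgr(example, 2):
        print(row)
```

Label and one-hot encodings keep one entry per position, so their length grows with the input, whereas the FCGR grid has a fixed size of 2^k by 2^k regardless of sequence length, which is one reason it is convenient as image-like input for a CNN.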