Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning

MOTIVATION: Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low-throughput and viable...

Bibliographic Details
Main Authors: Ren, Yunxiao; Chakraborty, Trinad; Doijad, Swapnil; Falgenhauer, Linda; Falgenhauer, Jane; Goesmann, Alexander; Hauschild, Anne-Christin; Schwengers, Oliver; Heider, Dominik
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8722762/
https://www.ncbi.nlm.nih.gov/pubmed/34613360
http://dx.doi.org/10.1093/bioinformatics/btab681
author Ren, Yunxiao
Chakraborty, Trinad
Doijad, Swapnil
Falgenhauer, Linda
Falgenhauer, Jane
Goesmann, Alexander
Hauschild, Anne-Christin
Schwengers, Oliver
Heider, Dominik
author_sort Ren, Yunxiao
collection PubMed
description MOTIVATION: Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low-throughput and viable only for cultivable bacteria. Machine learning methods may pave the way for automated AMR prediction based on genomic data of the bacteria. However, a comparison of different machine learning methods for the prediction of AMR based on different encodings and whole-genome sequencing data, without prior knowledge, remains to be done. RESULTS: In this study, we evaluated logistic regression (LR), support vector machine (SVM), random forest (RF) and convolutional neural network (CNN) for the prediction of AMR for the antibiotics ciprofloxacin, cefotaxime, ceftazidime and gentamicin. We demonstrate that these models can effectively predict AMR with label encoding, one-hot encoding and frequency matrix chaos game representation (FCGR encoding) on whole-genome sequencing data. We trained these models on a large AMR dataset and evaluated them on an independent public dataset. Generally, RFs and CNNs perform better than LR and SVM, with AUCs up to 0.96. Furthermore, we were able to identify mutations that are associated with AMR for each antibiotic. AVAILABILITY AND IMPLEMENTATION: Source code for data preparation and model training is provided on GitHub (https://github.com/YunxiaoRen/ML-iAMR). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
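note The three encodings named in the description can be illustrated with a minimal Python sketch. This is not the authors' pipeline (their code is in the repository linked above). The function names, the integer codes, the A/C/G/T column order and the corner assignment of the chaos game square are all assumptions chosen for illustration, and the sketch works on a plain nucleotide string rather than on the study's data.

```python
# Illustrative sketch only; names and conventions below are assumptions,
# not taken from the ML-iAMR repository.

# Hypothetical integer codes for label encoding; 0 marks any other symbol.
NUC_TO_LABEL = {"A": 1, "C": 2, "G": 3, "T": 4}

# Assumed corner assignment of the unit square for the chaos game.
CORNER = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def label_encode(seq):
    """Map each nucleotide to a single integer code."""
    return [NUC_TO_LABEL.get(base, 0) for base in seq.upper()]


def one_hot_encode(seq):
    """Map each nucleotide to a 4-element indicator vector (A, C, G, T)."""
    order = "ACGT"
    return [[1 if base == n else 0 for n in order] for base in seq.upper()]


def fcgr(seq, k):
    """Count k-mers into a 2^k x 2^k frequency chaos game matrix.

    Each k-mer is mapped to one cell. The last base of the k-mer selects
    the coarsest quadrant, mirroring the chaos game, where the most recent
    step moves the point halfway to its corner.
    """
    size = 2 ** k
    matrix = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNER for base in kmer):
            continue  # skip k-mers containing ambiguous symbols such as N
        x = y = 0
        for base in reversed(kmer):
            cx, cy = CORNER[base]
            x = 2 * x + cx
            y = 2 * y + cy
        matrix[y][x] += 1
    return matrix


if __name__ == "__main__":
    print(label_encode("ACGTN"))
    print(one_hot_encode("AG"))
    print(fcgr("ACGTAC", 2))
```

Label encoding yields one integer per position, one-hot encoding yields four binary values per position, and the FCGR matrix has a fixed size that depends only on k, which is what makes it usable as an image-like input for a CNN.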
format Online
Article
Text
id pubmed-8722762
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8722762 2022-01-05 Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning Ren, Yunxiao; Chakraborty, Trinad; Doijad, Swapnil; Falgenhauer, Linda; Falgenhauer, Jane; Goesmann, Alexander; Hauschild, Anne-Christin; Schwengers, Oliver; Heider, Dominik Bioinformatics Original Paper
Oxford University Press 2021-10-06 /pmc/articles/PMC8722762/ /pubmed/34613360 http://dx.doi.org/10.1093/bioinformatics/btab681 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8722762/
https://www.ncbi.nlm.nih.gov/pubmed/34613360
http://dx.doi.org/10.1093/bioinformatics/btab681