Genome-Wide Mutation Scoring for Machine-Learning-Based Antimicrobial Resistance Prediction
The prediction of antimicrobial resistance (AMR) based on genomic information can improve patient outcomes. Genetic mechanisms have been shown to explain AMR with accuracies in line with standard microbiology laboratory testing. To translate genetic mechanisms into phenotypic AMR, machine learning h...
Main authors: | Májek, Peter; Lüftinger, Lukas; Beisken, Stephan; Rattei, Thomas; Materna, Arne |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8657983/ https://www.ncbi.nlm.nih.gov/pubmed/34884852 http://dx.doi.org/10.3390/ijms222313049 |
_version_ | 1784612626208129024 |
---|---|
author | Májek, Peter Lüftinger, Lukas Beisken, Stephan Rattei, Thomas Materna, Arne |
author_sort | Májek, Peter |
collection | PubMed |
description | The prediction of antimicrobial resistance (AMR) based on genomic information can improve patient outcomes. Genetic mechanisms have been shown to explain AMR with accuracies in line with standard microbiology laboratory testing. To translate genetic mechanisms into phenotypic AMR, machine learning has been successfully applied. AMR machine learning models typically use nucleotide k-mer counts to represent genomic sequences. While k-mer representation efficiently captures sequence variation, it also results in high-dimensional and sparse data. With limited training data available, achieving acceptable model performance or model interpretability is challenging. In this study, we explore the utility of feature engineering with several biologically relevant signals. We propose to predict the functional impact of observed mutations with PROVEAN to use the predicted impact as a new feature for each protein in an organism’s proteome. The addition of the new features was tested on a total of 19,521 isolates across nine clinically relevant pathogens and 30 different antibiotics. The new features significantly improved the predictive performance of trained AMR models for Pseudomonas aeruginosa, Citrobacter freundii, and Escherichia coli. The balanced accuracy of the respective models of those three pathogens improved by 6.0% on average. |
format | Online Article Text |
id | pubmed-8657983 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8657983 2021-12-10 Genome-Wide Mutation Scoring for Machine-Learning-Based Antimicrobial Resistance Prediction Májek, Peter Lüftinger, Lukas Beisken, Stephan Rattei, Thomas Materna, Arne Int J Mol Sci Article The prediction of antimicrobial resistance (AMR) based on genomic information can improve patient outcomes. Genetic mechanisms have been shown to explain AMR with accuracies in line with standard microbiology laboratory testing. To translate genetic mechanisms into phenotypic AMR, machine learning has been successfully applied. AMR machine learning models typically use nucleotide k-mer counts to represent genomic sequences. While k-mer representation efficiently captures sequence variation, it also results in high-dimensional and sparse data. With limited training data available, achieving acceptable model performance or model interpretability is challenging. In this study, we explore the utility of feature engineering with several biologically relevant signals. We propose to predict the functional impact of observed mutations with PROVEAN to use the predicted impact as a new feature for each protein in an organism’s proteome. The addition of the new features was tested on a total of 19,521 isolates across nine clinically relevant pathogens and 30 different antibiotics. The new features significantly improved the predictive performance of trained AMR models for Pseudomonas aeruginosa, Citrobacter freundii, and Escherichia coli. The balanced accuracy of the respective models of those three pathogens improved by 6.0% on average. MDPI 2021-12-02 /pmc/articles/PMC8657983/ /pubmed/34884852 http://dx.doi.org/10.3390/ijms222313049 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Genome-Wide Mutation Scoring for Machine-Learning-Based Antimicrobial Resistance Prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8657983/ https://www.ncbi.nlm.nih.gov/pubmed/34884852 http://dx.doi.org/10.3390/ijms222313049 |
work_keys_str_mv | AT majekpeter genomewidemutationscoringformachinelearningbasedantimicrobialresistanceprediction AT luftingerlukas genomewidemutationscoringformachinelearningbasedantimicrobialresistanceprediction AT beiskenstephan genomewidemutationscoringformachinelearningbasedantimicrobialresistanceprediction AT ratteithomas genomewidemutationscoringformachinelearningbasedantimicrobialresistanceprediction AT maternaarne genomewidemutationscoringformachinelearningbasedantimicrobialresistanceprediction |
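
The description field above mentions two technical notions: nucleotide k-mer counts as a genome representation, and balanced accuracy as the evaluation metric. The following is a minimal Python sketch of both, not the authors' pipeline. The function names `kmer_counts` and `balanced_accuracy`, the toy sequence, and the toy labels are illustrative assumptions; the metric is written for binary labels (1 = resistant, 0 = susceptible) and assumes both classes are present.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a nucleotide sequence (hypothetical helper)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def balanced_accuracy(y_true, y_pred) -> float:
    """Mean of sensitivity and specificity for binary labels.

    Assumes at least one positive and one negative label in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = sum(1 for t in y_true if t == 0)
    return 0.5 * (tp / positives + tn / negatives)


# Toy sequence: only 4 of the 4**3 = 64 possible 3-mers occur, which
# illustrates why k-mer feature vectors are high-dimensional and sparse.
counts = kmer_counts("ATGATGC", k=3)
observed_fraction = len(counts) / 4 ** 3

# Toy labels: sensitivity 1/2, specificity 3/4.
score = balanced_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

With realistic values of k the number of possible k-mers grows as 4^k, so most entries of the feature vector are zero for any single genome, which is the sparsity the abstract refers to.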