Cargando…

Using random forest to predict antimicrobial minimum inhibitory concentrations of nontyphoidal Salmonella in Taiwan

Antimicrobial resistance (AMR) is a global health issue and surveillance of AMR can be useful for understanding AMR trends and planning intervention strategies. Salmonella, widely distributed in food-producing animals, has been considered the first priority for inclusion in the AMR surveillance prog...

Descripción completa

Detalles Bibliográficos
Autores principales: Wang, Chia-Chi, Hung, Yu-Ting, Chou, Che-Yu, Hsuan, Shih-Ling, Chen, Zeng-Weng, Chang, Pei-Yu, Jan, Tong-Rong, Tung, Chun-Wei
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9903507/
https://www.ncbi.nlm.nih.gov/pubmed/36747286
http://dx.doi.org/10.1186/s13567-023-01141-5
Descripción
Sumario:Antimicrobial resistance (AMR) is a global health issue and surveillance of AMR can be useful for understanding AMR trends and planning intervention strategies. Salmonella, widely distributed in food-producing animals, has been considered the first priority for inclusion in the AMR surveillance program by the World Health Organization (WHO). Recent advances in rapid and affordable whole-genome sequencing (WGS) techniques lead to the emergence of WGS as a one-stop test to predict the antimicrobial susceptibility. Since the variation of sequencing and minimum inhibitory concentration (MIC) measurement methods could result in different results, this study aimed to develop WGS-based random forest models for predicting MIC values of 24 drugs using data generated from the same laboratories in Taiwan. The WGS data have been transformed as a feature vector of 10-mers for machine learning. Based on rigorous validation and independent tests, a good performance was obtained with an average mean absolute error (MAE) less than 1 for both validation and independent test. Feature selection was then applied to identify top-ranked 10-mers that can further improve the prediction performance. For surveillance purposes, the genome sequence-based machine learning methods could be utilized to monitor the difference between predicted and experimental MIC, where a large difference might be worthy of investigation on the emerging genomic determinants. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13567-023-01141-5.