Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study
BACKGROUND: Foodborne diseases, as a type of disease with a high global incidence, place a heavy burden on public health and social economy. Foodborne pathogens, as the main factor of foodborne diseases, play an important role in the treatment and prevention of foodborne diseases; however, foodborne...
Main Authors: | Wang, Hanxue; Cui, Wenjuan; Guo, Yunchang; Du, Yi; Zhou, Yuanchun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7872834/ https://www.ncbi.nlm.nih.gov/pubmed/33496675 http://dx.doi.org/10.2196/24924 |
_version_ | 1783649264448045056 |
---|---|
author | Wang, Hanxue Cui, Wenjuan Guo, Yunchang Du, Yi Zhou, Yuanchun |
author_facet | Wang, Hanxue Cui, Wenjuan Guo, Yunchang Du, Yi Zhou, Yuanchun |
author_sort | Wang, Hanxue |
collection | PubMed |
description | BACKGROUND: Foodborne diseases, as a type of disease with a high global incidence, place a heavy burden on public health and social economy. Foodborne pathogens, as the main factor of foodborne diseases, play an important role in the treatment and prevention of foodborne diseases; however, foodborne diseases caused by different pathogens lack specificity in clinical features, and there is a low proportion of clinically actual pathogen detection in real life. OBJECTIVE: We aimed to analyze foodborne disease case data, select appropriate features based on analysis results, and use machine learning methods to classify foodborne disease pathogens to predict foodborne disease pathogens that have not been tested. METHODS: We extracted features such as space, time, and exposed food from foodborne disease case data and analyzed the relationship between these features and the foodborne disease pathogens using a variety of machine learning methods to classify foodborne disease pathogens. We compared the results of 4 models to obtain the pathogen prediction model with the highest accuracy. RESULTS: The gradient boost decision tree model obtained the highest accuracy, with accuracy approaching 69% in identifying 4 pathogens including Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. By evaluating the importance of features such as time of illness, geographical longitude and latitude, and diarrhea frequency, we found that they play important roles in classifying the foodborne disease pathogens. CONCLUSIONS: Data analysis can reflect the distribution of some features of foodborne diseases and the relationship among the features. The classification of pathogens based on the analysis results and machine learning methods can provide beneficial support for clinical auxiliary diagnosis and treatment of foodborne diseases. |
format | Online Article Text |
id | pubmed-7872834 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-7872834 2021-02-22 Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study Wang, Hanxue Cui, Wenjuan Guo, Yunchang Du, Yi Zhou, Yuanchun JMIR Med Inform Original Paper BACKGROUND: Foodborne diseases, as a type of disease with a high global incidence, place a heavy burden on public health and social economy. Foodborne pathogens, as the main factor of foodborne diseases, play an important role in the treatment and prevention of foodborne diseases; however, foodborne diseases caused by different pathogens lack specificity in clinical features, and there is a low proportion of clinically actual pathogen detection in real life. OBJECTIVE: We aimed to analyze foodborne disease case data, select appropriate features based on analysis results, and use machine learning methods to classify foodborne disease pathogens to predict foodborne disease pathogens that have not been tested. METHODS: We extracted features such as space, time, and exposed food from foodborne disease case data and analyzed the relationship between these features and the foodborne disease pathogens using a variety of machine learning methods to classify foodborne disease pathogens. We compared the results of 4 models to obtain the pathogen prediction model with the highest accuracy. RESULTS: The gradient boost decision tree model obtained the highest accuracy, with accuracy approaching 69% in identifying 4 pathogens including Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. By evaluating the importance of features such as time of illness, geographical longitude and latitude, and diarrhea frequency, we found that they play important roles in classifying the foodborne disease pathogens. CONCLUSIONS: Data analysis can reflect the distribution of some features of foodborne diseases and the relationship among the features. The classification of pathogens based on the analysis results and machine learning methods can provide beneficial support for clinical auxiliary diagnosis and treatment of foodborne diseases. JMIR Publications 2021-01-26 /pmc/articles/PMC7872834/ /pubmed/33496675 http://dx.doi.org/10.2196/24924 Text en ©Hanxue Wang, Wenjuan Cui, Yunchang Guo, Yi Du, Yuanchun Zhou. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 26.01.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Wang, Hanxue Cui, Wenjuan Guo, Yunchang Du, Yi Zhou, Yuanchun Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study |
title | Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study |
title_full | Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study |
title_fullStr | Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study |
title_full_unstemmed | Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study |
title_short | Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study |
title_sort | machine learning prediction of foodborne disease pathogens: algorithm development and validation study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7872834/ https://www.ncbi.nlm.nih.gov/pubmed/33496675 http://dx.doi.org/10.2196/24924 |
work_keys_str_mv | AT wanghanxue machinelearningpredictionoffoodbornediseasepathogensalgorithmdevelopmentandvalidationstudy AT cuiwenjuan machinelearningpredictionoffoodbornediseasepathogensalgorithmdevelopmentandvalidationstudy AT guoyunchang machinelearningpredictionoffoodbornediseasepathogensalgorithmdevelopmentandvalidationstudy AT duyi machinelearningpredictionoffoodbornediseasepathogensalgorithmdevelopmentandvalidationstudy AT zhouyuanchun machinelearningpredictionoffoodbornediseasepathogensalgorithmdevelopmentandvalidationstudy |
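
The abstract in this record says the authors compared 4 models and kept the one with the highest accuracy. The following is a minimal sketch of that selection step only, not the authors' code. The function names, the model names `model_a` and `model_b`, and the toy labels are all hypothetical and carry no results from the study.

```python
from typing import Dict, Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases whose predicted pathogen equals the tested pathogen."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot score an empty set of cases")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def select_best_model(
    y_true: Sequence[str],
    predictions: Dict[str, Sequence[str]],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate model and return the most accurate one with all scores."""
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Hypothetical toy data: one case per pathogen class named in the abstract.
Y_TRUE = ["Salmonella", "Norovirus", "Escherichia coli", "Vibrio parahaemolyticus"]

# Hypothetical predictions from two placeholder models (not the study's models).
TOY_PREDICTIONS = {
    "model_a": ["Salmonella", "Norovirus", "Salmonella", "Norovirus"],
    "model_b": ["Salmonella", "Norovirus", "Escherichia coli", "Salmonella"],
}

if __name__ == "__main__":
    best, scores = select_best_model(Y_TRUE, TOY_PREDICTIONS)
    print(best, scores)
```

In practice the predictions would come from classifiers evaluated on held-out cases; the sketch assumes plain accuracy is the comparison metric because that is the only metric the abstract names.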