Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation
Diseases caused by the consumption of food are a significant but avoidable public health issue, and identifying the source of contamination is a key step in an outbreak investigation to prevent foodborne illnesses. Historical foodborne outbreaks provide rich data on critical attributes such as outbr...
Main authors: | Tao, Dandan; Zhang, Dongyu; Hu, Ruofan; Rundensteiner, Elke; Feng, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606626/ https://www.ncbi.nlm.nih.gov/pubmed/37893718 http://dx.doi.org/10.3390/foods12203825 |
_version_ | 1785127361182695424 |
---|---|
author | Tao, Dandan Zhang, Dongyu Hu, Ruofan Rundensteiner, Elke Feng, Hao |
author_facet | Tao, Dandan Zhang, Dongyu Hu, Ruofan Rundensteiner, Elke Feng, Hao |
author_sort | Tao, Dandan |
collection | PubMed |
description | Diseases caused by the consumption of food are a significant but avoidable public health issue, and identifying the source of contamination is a key step in an outbreak investigation to prevent foodborne illnesses. Historical foodborne outbreaks provide rich data on critical attributes such as outbreak factors, food vehicles, and etiologies, and an improved understanding of the relationships between these attributes could provide insights for developing effective food safety interventions. The purpose of this study was to identify hidden patterns underlying the relations between the critical attributes involved in historical foodborne outbreaks through data mining approaches. A statistical analysis was used to identify the associations between outbreak factors and food sources, and the factors that were strongly significant were selected as predictive factors for food vehicles. A multinomial prediction model was built based on factors selected for predicting “simple” foods (beef, dairy, and vegetables) as sources of outbreaks. In addition, the relations between the food vehicles and common etiologies were investigated through text mining approaches (support vector machines, logistic regression, random forest, and naïve Bayes). A support vector machine model was identified as the optimal model to predict etiologies from the occurrence of food vehicles. Association rules also indicated the specific food vehicles that have strong relations to the etiologies. Meanwhile, a food ingredient network describing the relationships between foods and ingredients was constructed and used with Monte Carlo simulation to predict possible ingredients from foods that cause an outbreak. The simulated results were confirmed with foods and ingredients that are already known to cause historical foodborne outbreaks. The method could provide insights into the prediction of the possible ingredient sources of contamination when given the name of a food. The results could provide insights into the early identification of food sources of contamination and assist in future outbreak investigations. The data-driven approach will provide a new perspective and strategies for discovering hidden knowledge from massive data. |
format | Online Article Text |
id | pubmed-10606626 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-106066262023-10-28 Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation Tao, Dandan Zhang, Dongyu Hu, Ruofan Rundensteiner, Elke Feng, Hao Foods Article Diseases caused by the consumption of food are a significant but avoidable public health issue, and identifying the source of contamination is a key step in an outbreak investigation to prevent foodborne illnesses. Historical foodborne outbreaks provide rich data on critical attributes such as outbreak factors, food vehicles, and etiologies, and an improved understanding of the relationships between these attributes could provide insights for developing effective food safety interventions. The purpose of this study was to identify hidden patterns underlying the relations between the critical attributes involved in historical foodborne outbreaks through data mining approaches. A statistical analysis was used to identify the associations between outbreak factors and food sources, and the factors that were strongly significant were selected as predictive factors for food vehicles. A multinomial prediction model was built based on factors selected for predicting “simple” foods (beef, dairy, and vegetables) as sources of outbreaks. In addition, the relations between the food vehicles and common etiologies were investigated through text mining approaches (support vector machines, logistic regression, random forest, and naïve Bayes). A support vector machine model was identified as the optimal model to predict etiologies from the occurrence of food vehicles. Association rules also indicated the specific food vehicles that have strong relations to the etiologies. Meanwhile, a food ingredient network describing the relationships between foods and ingredients was constructed and used with Monte Carlo simulation to predict possible ingredients from foods that cause an outbreak. 
The simulated results were confirmed with foods and ingredients that are already known to cause historical foodborne outbreaks. The method could provide insights into the prediction of the possible ingredient sources of contamination when given the name of a food. The results could provide insights into the early identification of food sources of contamination and assist in future outbreak investigations. The data-driven approach will provide a new perspective and strategies for discovering hidden knowledge from massive data. MDPI 2023-10-19 /pmc/articles/PMC10606626/ /pubmed/37893718 http://dx.doi.org/10.3390/foods12203825 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Tao, Dandan Zhang, Dongyu Hu, Ruofan Rundensteiner, Elke Feng, Hao Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation |
title | Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation |
title_full | Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation |
title_fullStr | Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation |
title_full_unstemmed | Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation |
title_short | Epidemiological Data Mining for Assisting with Foodborne Outbreak Investigation |
title_sort | epidemiological data mining for assisting with foodborne outbreak investigation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606626/ https://www.ncbi.nlm.nih.gov/pubmed/37893718 http://dx.doi.org/10.3390/foods12203825 |
work_keys_str_mv | AT taodandan epidemiologicaldataminingforassistingwithfoodborneoutbreakinvestigation AT zhangdongyu epidemiologicaldataminingforassistingwithfoodborneoutbreakinvestigation AT huruofan epidemiologicaldataminingforassistingwithfoodborneoutbreakinvestigation AT rundensteinerelke epidemiologicaldataminingforassistingwithfoodborneoutbreakinvestigation AT fenghao epidemiologicaldataminingforassistingwithfoodborneoutbreakinvestigation |
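
The description field in this record mentions association rules linking food vehicles to etiologies. As a rough illustration of how such a rule is usually scored, the sketch below computes the standard support, confidence, and lift of a rule "food vehicle -> etiology". It is not the authors' implementation: the `outbreaks` records are hypothetical toy data and `rule_metrics` is an illustrative helper name, neither of which comes from the article.

```python
# Hypothetical toy records (NOT data from the article):
# each outbreak is (set of implicated food vehicles, etiology).
outbreaks = [
    ({"beef"}, "E. coli"),
    ({"beef", "lettuce"}, "E. coli"),
    ({"chicken"}, "Salmonella"),
    ({"eggs", "chicken"}, "Salmonella"),
    ({"lettuce"}, "Norovirus"),
    ({"beef"}, "Salmonella"),
]


def rule_metrics(records, food, etiology):
    """Return (support, confidence, lift) for the rule food -> etiology.

    support    = P(food and etiology)
    confidence = P(etiology | food)
    lift       = confidence / P(etiology)
    """
    n = len(records)
    n_food = sum(1 for foods, _ in records if food in foods)
    n_etio = sum(1 for _, e in records if e == etiology)
    n_both = sum(1 for foods, e in records if food in foods and e == etiology)
    # A rule is undefined if either side never occurs; report zeros instead.
    if n == 0 or n_food == 0 or n_etio == 0:
        return 0.0, 0.0, 0.0
    support = n_both / n
    confidence = n_both / n_food
    lift = confidence / (n_etio / n)
    return support, confidence, lift


if __name__ == "__main__":
    for food, etiology in [("beef", "E. coli"), ("chicken", "Salmonella")]:
        s, c, l = rule_metrics(outbreaks, food, etiology)
        print(f"{food} -> {etiology}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 means the etiology appears with that food more often than its overall frequency would suggest. On real outbreak data, rules are normally also filtered by minimum support so that rare coincidences are not over-interpreted.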