
Fuzzy association rule mining and classification for the prediction of malaria in South Korea

BACKGROUND: Malaria is the world’s most prevalent vector-borne disease. Accurate prediction of malaria outbreaks may lead to public health interventions that mitigate disease morbidity and mortality. METHODS: We describe an application of a method for creating prediction models utilizing Fuzzy Assoc...


Bibliographic Details
Main authors: Buczak, Anna L., Baugher, Benjamin, Guven, Erhan, Ramac-Thomas, Liane C., Elbert, Yevgeniy, Babin, Steven M., Lewis, Sheri H.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4472166/
https://www.ncbi.nlm.nih.gov/pubmed/26084541
http://dx.doi.org/10.1186/s12911-015-0170-6
author Buczak, Anna L.
Baugher, Benjamin
Guven, Erhan
Ramac-Thomas, Liane C.
Elbert, Yevgeniy
Babin, Steven M.
Lewis, Sheri H.
collection PubMed
description BACKGROUND: Malaria is the world’s most prevalent vector-borne disease. Accurate prediction of malaria outbreaks may lead to public health interventions that mitigate disease morbidity and mortality. METHODS: We describe an application of a method for creating prediction models utilizing Fuzzy Association Rule Mining to extract relationships between epidemiological, meteorological, climatic, and socio-economic data from Korea. These relationships are in the form of rules, from which the best set of rules is automatically chosen and forms a classifier. Two classifiers have been built and their results fused to become a malaria prediction model. Future malaria cases are predicted as LOW, MEDIUM or HIGH, where these classes are defined as a total of 0–2, 3–16, and above 17 cases, respectively, for a region in South Korea during a two-week period. Based on user recommendations, HIGH is considered an outbreak. RESULTS: Model accuracy is described by Positive Predictive Value (PPV), Sensitivity, and F-score for each class, computed on test data not previously used to develop the model. For predictions made 7–8 weeks in advance, model PPV and Sensitivity are 0.842 and 0.681, respectively, for the HIGH classes. The F0.5 and F3 scores (which combine PPV and Sensitivity) are 0.804 and 0.694, respectively, for the HIGH classes. The overall FARM results (as measured by F-scores) are significantly better than those obtained by Decision Tree, Random Forest, Support Vector Machine, and Holt-Winters methods for the HIGH class. For the MEDIUM class, Random Forest and FARM obtain comparable results, with FARM being better at F0.5, and Random Forest obtaining a higher F3. CONCLUSIONS: A previously described method for creating disease prediction models has been modified and extended to build models for predicting malaria. In addition, some new input variables were used, including indicators of intervention measures. 
The South Korea malaria prediction models predict LOW, MEDIUM or HIGH cases 7–8 weeks in the future. This paper demonstrates that our data-driven approach can be used for the prediction of different diseases.
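The description above reports F0.5 and F3 scores that "combine PPV and Sensitivity". As a minimal sketch, assuming the standard F-beta definition (a weighted harmonic mean of PPV and Sensitivity), the reported values for the HIGH class can be reproduced from the reported PPV and Sensitivity. The function name `f_beta` is hypothetical and not taken from the article.

```python
def f_beta(ppv: float, sensitivity: float, beta: float) -> float:
    """Standard F-beta score: weighted harmonic mean of PPV and Sensitivity.

    beta < 1 weights PPV more heavily; beta > 1 weights Sensitivity more heavily.
    Assumption: the article uses this standard definition.
    """
    if ppv == 0.0 and sensitivity == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1.0 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)


if __name__ == "__main__":
    # PPV and Sensitivity reported in the description for the HIGH class
    ppv_high, sens_high = 0.842, 0.681
    print(round(f_beta(ppv_high, sens_high, 0.5), 3))  # 0.804
    print(round(f_beta(ppv_high, sens_high, 3.0), 3))  # 0.694
```

Under this assumption, F0.5 rewards a model whose HIGH predictions are rarely false alarms, while F3 rewards a model that misses few true HIGH periods.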
format Online
Article
Text
id pubmed-4472166
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4472166 2015-06-19 BMC Med Inform Decis Mak Research Article BioMed Central 2015-06-18 /pmc/articles/PMC4472166/ /pubmed/26084541 http://dx.doi.org/10.1186/s12911-015-0170-6 Text en © Buczak et al. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Fuzzy association rule mining and classification for the prediction of malaria in South Korea
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4472166/
https://www.ncbi.nlm.nih.gov/pubmed/26084541
http://dx.doi.org/10.1186/s12911-015-0170-6