
Comparison of three data mining models for prediction of advanced schistosomiasis prognosis in the Hubei province

BACKGROUND: In order to better assist medical professionals, this study aimed to develop and compare the performance of three models—a multivariate logistic regression (LR) model, an artificial neural network (ANN) model, and a decision tree (DT) model—to predict the prognosis of patients with advan...


Bibliographic Details
Main Authors: Li, Guo; Zhou, Xiaorong; Liu, Jianbing; Chen, Yuanqi; Zhang, Hengtao; Chen, Yanyan; Liu, Jianhua; Jiang, Hongbo; Yang, Junjing; Nie, Shaofa
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5831639/
https://www.ncbi.nlm.nih.gov/pubmed/29447165
http://dx.doi.org/10.1371/journal.pntd.0006262
_version_ 1783303175194804224
author Li, Guo
Zhou, Xiaorong
Liu, Jianbing
Chen, Yuanqi
Zhang, Hengtao
Chen, Yanyan
Liu, Jianhua
Jiang, Hongbo
Yang, Junjing
Nie, Shaofa
author_sort Li, Guo
collection PubMed
description BACKGROUND: To better assist medical professionals, this study aimed to develop and compare the performance of three models for predicting the prognosis of patients with advanced schistosomiasis residing in Hubei province: a multivariate logistic regression (LR) model, an artificial neural network (ANN) model, and a decision tree (DT) model. METHODOLOGY/PRINCIPAL FINDINGS: Schistosomiasis surveillance data were collected from a previous study based on a Hubei population sample including 4136 advanced schistosomiasis cases. The predictive models used LR, ANN, and DT methods. From each of the three groups, 70% of the cases (2896 cases) were used as training data for the predictive models. The remaining 30% of the cases (1240 cases) were used as validation groups for performance comparisons between the three models. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. Univariate analysis indicated that 16 risk factors were significantly associated with patient prognosis. In the training group, the mean AUC was 0.8276 for LR, 0.9267 for ANN, and 0.8229 for DT. In the validation group, the mean AUC was 0.8349 for LR, 0.8318 for ANN, and 0.8148 for DT. The three models yielded similar results in terms of accuracy, sensitivity, and specificity. CONCLUSIONS/SIGNIFICANCE: Predictive models for advanced schistosomiasis prognosis using LR, ANN, and DT methods proved to be effective approaches on our dataset. The ANN model outperformed the LR and DT models in terms of AUC.
format Online
Article
Text
id pubmed-5831639
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5831639 2018-03-15 PLoS Negl Trop Dis Research Article
Public Library of Science 2018-02-15 /pmc/articles/PMC5831639/ /pubmed/29447165 http://dx.doi.org/10.1371/journal.pntd.0006262 Text en © 2018 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Comparison of three data mining models for prediction of advanced schistosomiasis prognosis in the Hubei province
topic Research Article
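The abstract in the description field reports four evaluation measures: AUC, sensitivity, specificity, and accuracy. As a rough illustration of how such measures are commonly computed for a binary outcome, here is a minimal Python sketch. It is not the authors' code: the function names, the 0.5 cut-off, and the four-case toy data are all assumptions made for this example, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than any method confirmed by the record.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy for binary labels (1 = event, 0 = no event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of true events that were flagged
        "specificity": tn / (tn + fp),  # share of non-events correctly cleared
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher than a
    random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = confusion_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Sensitivity, specificity, and accuracy depend on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is one reason models can differ in AUC while showing similar values on the other three measures.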