
Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model

In view of the harm of diabetes to the population, we have introduced an ensemble learning algorithm—EXtreme Gradient Boosting (XGBoost) to predict the risk of type 2 diabetes and compared it with Support Vector Machines (SVM), the Random Forest (RF) and K-Nearest Neighbor (K-NN) algorithm in order...


Bibliographic Details
Main authors: Wang, Liyang, Wang, Xiaoya, Chen, Angxuan, Jin, Xian, Che, Huilian
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7551910/
https://www.ncbi.nlm.nih.gov/pubmed/32751894
http://dx.doi.org/10.3390/healthcare8030247
author Wang, Liyang
Wang, Xiaoya
Chen, Angxuan
Jin, Xian
Che, Huilian
collection PubMed
description Given the harm that diabetes causes to the population, we introduce an ensemble learning algorithm, eXtreme Gradient Boosting (XGBoost), to predict the risk of type 2 diabetes, and compare it with Support Vector Machine (SVM), Random Forest (RF) and K-Nearest Neighbor (K-NN) algorithms in order to improve on the predictive performance of existing models. A combination of convenience sampling and snowball sampling in Xicheng District, Beijing, was used to conduct a questionnaire survey on the personal data, eating habits, exercise status and family medical history of 380 middle-aged and elderly people. We then trained the models and obtained a disease risk index for each sample using 10-fold cross-validation. In experiments comparing these commonly used machine learning algorithms, XGBoost gave the best predictive performance, with an average accuracy of 0.8909 and an area under the receiver operating characteristic curve (AUC) of 0.9182. We therefore conclude that, owing to its architecture, XGBoost offers better prediction accuracy and generalization ability than the compared algorithms in predicting the risk of type 2 diabetes, which is conducive to the intelligent prevention and control of diabetes in the future.
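The evaluation protocol named in the description (10-fold cross-validation, a per-sample risk score, then accuracy and AUC) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: it does not use XGBoost, the data are synthetic with a single made-up feature, a simple K-NN vote stands in for the classifier, and every function name here is hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_risk(train_x, train_y, x, k=5):
    """Stand-in classifier: share of positives among the k nearest points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def cross_validated_scores(xs, ys, k_folds=10):
    """Score each sample with a model fitted on the other folds only."""
    scores = [0.0] * len(xs)
    for fold in k_fold_indices(len(xs), k_folds):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in fold:
            scores[i] = knn_risk(train_x, train_y, xs[i])
    return scores


# Synthetic data (assumption): 380 samples, one feature shifted by the label.
rng = random.Random(42)
ys = [i % 2 for i in range(380)]
xs = [rng.gauss(1.5 * y, 1.0) for y in ys]

scores = cross_validated_scores(xs, ys)
preds = [int(s >= 0.5) for s in scores]
print("accuracy:", accuracy(ys, preds))
print("AUC:", auc(ys, scores))
```

Because every sample is scored by a model that never saw it during fitting, the resulting accuracy and AUC estimate performance on unseen data; the printed values describe only this synthetic example and are unrelated to the figures reported in the description.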
format Online
Article
Text
id pubmed-7551910
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7551910 2020-10-14 Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model. Wang, Liyang; Wang, Xiaoya; Chen, Angxuan; Jin, Xian; Che, Huilian. Healthcare (Basel), Article. MDPI 2020-07-31 /pmc/articles/PMC7551910/ /pubmed/32751894 http://dx.doi.org/10.3390/healthcare8030247 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model
topic Article