Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model
Given the harm diabetes causes to the population, we introduce an ensemble learning algorithm, EXtreme Gradient Boosting (XGBoost), to predict the risk of type 2 diabetes and compare it with Support Vector Machine (SVM), Random Forest (RF) and K-Nearest Neighbor (K-NN) algorithms in order...
Main authors: | Wang, Liyang; Wang, Xiaoya; Chen, Angxuan; Jin, Xian; Che, Huilian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7551910/ https://www.ncbi.nlm.nih.gov/pubmed/32751894 http://dx.doi.org/10.3390/healthcare8030247 |
_version_ | 1783593284263739392 |
---|---|
author | Wang, Liyang Wang, Xiaoya Chen, Angxuan Jin, Xian Che, Huilian |
author_facet | Wang, Liyang Wang, Xiaoya Chen, Angxuan Jin, Xian Che, Huilian |
author_sort | Wang, Liyang |
collection | PubMed |
description | Given the harm diabetes causes to the population, we introduce an ensemble learning algorithm, EXtreme Gradient Boosting (XGBoost), to predict the risk of type 2 diabetes and compare it with Support Vector Machine (SVM), Random Forest (RF) and K-Nearest Neighbor (K-NN) algorithms in order to improve on the predictive performance of existing models. A combination of convenience sampling and snowball sampling in Xicheng District, Beijing was used to conduct a questionnaire survey on the personal data, eating habits, exercise status and family medical history of 380 middle-aged and elderly people. We then trained the models and obtained a disease risk index for each sample using 10-fold cross-validation. Experiments comparing the commonly used machine learning algorithms mentioned above showed that XGBoost had the best predictive performance, with an average accuracy of 0.8909 and an area under the receiver operating characteristic curve (AUC) of 0.9182. Therefore, owing to its architecture, XGBoost achieves higher prediction accuracy and better generalization ability than the compared algorithms in predicting the risk of type 2 diabetes, which is conducive to the intelligent prevention and control of diabetes in the future. |
format | Online Article Text |
id | pubmed-7551910 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7551910 2020-10-14 Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model Wang, Liyang Wang, Xiaoya Chen, Angxuan Jin, Xian Che, Huilian Healthcare (Basel) Article Given the harm diabetes causes to the population, we introduce an ensemble learning algorithm, EXtreme Gradient Boosting (XGBoost), to predict the risk of type 2 diabetes and compare it with Support Vector Machine (SVM), Random Forest (RF) and K-Nearest Neighbor (K-NN) algorithms in order to improve on the predictive performance of existing models. A combination of convenience sampling and snowball sampling in Xicheng District, Beijing was used to conduct a questionnaire survey on the personal data, eating habits, exercise status and family medical history of 380 middle-aged and elderly people. We then trained the models and obtained a disease risk index for each sample using 10-fold cross-validation. Experiments comparing the commonly used machine learning algorithms mentioned above showed that XGBoost had the best predictive performance, with an average accuracy of 0.8909 and an area under the receiver operating characteristic curve (AUC) of 0.9182. Therefore, owing to its architecture, XGBoost achieves higher prediction accuracy and better generalization ability than the compared algorithms in predicting the risk of type 2 diabetes, which is conducive to the intelligent prevention and control of diabetes in the future. MDPI 2020-07-31 /pmc/articles/PMC7551910/ /pubmed/32751894 http://dx.doi.org/10.3390/healthcare8030247 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Wang, Liyang Wang, Xiaoya Chen, Angxuan Jin, Xian Che, Huilian Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model |
title | Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model |
title_full | Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model |
title_fullStr | Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model |
title_full_unstemmed | Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model |
title_short | Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model |
title_sort | prediction of type 2 diabetes risk and its effect evaluation based on the xgboost model |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7551910/ https://www.ncbi.nlm.nih.gov/pubmed/32751894 http://dx.doi.org/10.3390/healthcare8030247 |
work_keys_str_mv | AT wangliyang predictionoftype2diabetesriskanditseffectevaluationbasedonthexgboostmodel AT wangxiaoya predictionoftype2diabetesriskanditseffectevaluationbasedonthexgboostmodel AT chenangxuan predictionoftype2diabetesriskanditseffectevaluationbasedonthexgboostmodel AT jinxian predictionoftype2diabetesriskanditseffectevaluationbasedonthexgboostmodel AT chehuilian predictionoftype2diabetesriskanditseffectevaluationbasedonthexgboostmodel |
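
The description field above mentions 10-fold cross-validation over 380 samples and an AUC score. The following is a minimal, standard-library sketch of those two ideas, not the authors' code. The function names and the `fit` callback (which takes training features and labels and returns a scoring function) are hypothetical, introduced only for illustration. A real comparison would plug in XGBoost, SVM, RF or K-NN models in place of `fit`.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled test folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(features, labels, fit, k=10, seed=0):
    """Return out-of-fold scores: each sample is scored by a model
    trained without that sample's fold.

    `fit` is a hypothetical callback: fit(train_x, train_y) -> predict(x).
    """
    scores = [None] * len(labels)
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predict = fit([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        for i in test_idx:
            scores[i] = predict(features[i])
    return scores
```

With 380 samples and k=10, each fold holds 38 samples. Passing the out-of-fold scores from `cross_validate` to `roc_auc` gives one AUC estimate over the whole sample. Averaging per-fold metrics instead, as the "average accuracy" in the description suggests, is an equally common choice.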