Improving Machine Learning Diabetes Prediction Models for the Utmost Clinical Effectiveness

The early prediction of diabetes can facilitate interventions to prevent or delay it. This study proposes a diabetes prediction model based on machine learning (ML) to encourage individuals at risk of diabetes to employ healthy interventions. A total of 38,379 subjects were included. We trained the...

Bibliographic Details
Main Authors: Shin, Juyoung, Lee, Joonyub, Ko, Taehoon, Lee, Kanghyuck, Choi, Yera, Kim, Hun-Sung
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9698354/
https://www.ncbi.nlm.nih.gov/pubmed/36422075
http://dx.doi.org/10.3390/jpm12111899
_version_ 1784838795188764672
author Shin, Juyoung
Lee, Joonyub
Ko, Taehoon
Lee, Kanghyuck
Choi, Yera
Kim, Hun-Sung
collection PubMed
description The early prediction of diabetes can facilitate interventions to prevent or delay it. This study proposes a diabetes prediction model based on machine learning (ML) to encourage individuals at risk of diabetes to employ healthy interventions. A total of 38,379 subjects were included. We trained the model on 80% of the subjects and verified its predictive performance on the remaining 20%. Furthermore, the performances of several algorithms were compared, including logistic regression, decision tree, random forest, eXtreme Gradient Boosting (XGBoost), Cox regression, and XGBoost Survival Embedding (XGBSE). The area under the receiver operating characteristic curve (AUROC) of the XGBoost model was the largest, followed by those of the decision tree, logistic regression, and random forest models. For the survival analysis, XGBSE yielded an AUROC exceeding 0.9 for the 2- to 9-year predictions and a C-index of 0.934, while the Cox regression achieved a C-index of 0.921. After lowering the threshold from 0.5 to 0.25, the sensitivity increased from 0.011 to 0.236 for the 2-year prediction model and from 0.607 to 0.994 for the 9-year prediction model, while the specificity showed negligible changes. We developed a high-performance diabetes prediction model that applied the XGBSE algorithm with threshold adjustment. We plan to use this prediction model in real clinical practice for diabetes prevention after simplifying and validating it externally.
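The threshold adjustment described in the abstract (lowering the decision cutoff from 0.5 to 0.25 to raise sensitivity) can be illustrated with a minimal sketch. This is not the authors' code: the function name `sensitivity_specificity` and the toy labels and probabilities below are hypothetical, chosen only to show how a lower cutoff trades a little specificity for more sensitivity.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when prob >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class being absent from the sample.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data (NOT from the study): 1 = developed diabetes, 0 = did not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.80, 0.40, 0.30, 0.10, 0.20, 0.15, 0.10, 0.05, 0.30, 0.02]

for threshold in (0.5, 0.25):
    sens, spec = sensitivity_specificity(y_true, y_prob, threshold)
    print(f"threshold={threshold:.2f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

On this toy sample the lower cutoff catches more true cases at the cost of one extra false positive. The figures reported in the abstract come from the study's own held-out 20% of subjects, not from a sketch like this.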
format Online
Article
Text
id pubmed-9698354
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9698354 2022-11-26 J Pers Med Article MDPI 2022-11-14 /pmc/articles/PMC9698354/ /pubmed/36422075 http://dx.doi.org/10.3390/jpm12111899 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Improving Machine Learning Diabetes Prediction Models for the Utmost Clinical Effectiveness
topic Article