Prediction of Type 2 Diabetes Based on Machine Learning Algorithm

Prediction of type 2 diabetes (T2D) occurrence allows a person at risk to take actions that can prevent onset or delay the progression of the disease. In this study, we developed a machine learning (ML) model to predict T2D occurrence in the following year (Y + 1) using variables in the current year...

Bibliographic Details
Main Authors:	Deberneh, Henock M., Kim, Intaek
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8004981/ https://www.ncbi.nlm.nih.gov/pubmed/33806973 http://dx.doi.org/10.3390/ijerph18063317

author	Deberneh, Henock M.; Kim, Intaek
collection	PubMed
description	Prediction of type 2 diabetes (T2D) occurrence allows a person at risk to take actions that can prevent onset or delay the progression of the disease. In this study, we developed a machine learning (ML) model to predict T2D occurrence in the following year (Y + 1) using variables in the current year (Y). The dataset for this study was collected at a private medical institute as electronic health records from 2013 to 2018. To construct the prediction model, key features were first selected using ANOVA tests, chi-squared tests, and recursive feature elimination methods. The resultant features were fasting plasma glucose (FPG), HbA1c, triglycerides, BMI, gamma-GTP, age, uric acid, sex, smoking, drinking, physical activity, and family history. We then employed logistic regression, random forest, support vector machine, XGBoost, and ensemble machine learning algorithms based on these variables to predict the outcome as normal (non-diabetic), prediabetes, or diabetes. Based on the experimental results, the performance of the prediction model proved to be reasonably good at forecasting the occurrence of T2D in the Korean population. The model can provide clinicians and patients with valuable predictive information on the likelihood of developing T2D. The cross-validation (CV) results showed that the ensemble models had a superior performance to that of the single models. The CV performance of the prediction models was improved by incorporating more medical history from the dataset.
format	Online Article Text
id	pubmed-8004981
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8004981 2021-03-29 Prediction of Type 2 Diabetes Based on Machine Learning Algorithm Deberneh, Henock M. Kim, Intaek Int J Environ Res Public Health Article MDPI 2021-03-23 /pmc/articles/PMC8004981/ /pubmed/33806973 http://dx.doi.org/10.3390/ijerph18063317 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Prediction of Type 2 Diabetes Based on Machine Learning Algorithm
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8004981/ https://www.ncbi.nlm.nih.gov/pubmed/33806973 http://dx.doi.org/10.3390/ijerph18063317
