Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques

INTRODUCTION: As one of the most prevalent chronic diseases in the United States, diabetes, especially type 2 diabetes, affects the health of millions of people and puts an enormous financial burden on the US economy. We aimed to develop predictive models to identify risk factors for type 2 diabetes...

Bibliographic Details
Main Authors: Xie, Zidian, Nikolayeva, Olga, Luo, Jiebo, Li, Dongmei
Format: Online Article Text
Language: English
Published: Centers for Disease Control and Prevention 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6795062/
https://www.ncbi.nlm.nih.gov/pubmed/31538566
http://dx.doi.org/10.5888/pcd16.190109
_version_ 1783459412672774144
author Xie, Zidian
Nikolayeva, Olga
Luo, Jiebo
Li, Dongmei
author_facet Xie, Zidian
Nikolayeva, Olga
Luo, Jiebo
Li, Dongmei
author_sort Xie, Zidian
collection PubMed
description INTRODUCTION: As one of the most prevalent chronic diseases in the United States, diabetes, especially type 2 diabetes, affects the health of millions of people and puts an enormous financial burden on the US economy. We aimed to develop predictive models to identify risk factors for type 2 diabetes, which could help facilitate early diagnosis and intervention and also reduce medical costs. METHODS: We analyzed cross-sectional data on 138,146 participants, including 20,467 with type 2 diabetes, from the 2014 Behavioral Risk Factor Surveillance System. We built several machine learning models for predicting type 2 diabetes, including support vector machine, decision tree, logistic regression, random forest, neural network, and Gaussian Naive Bayes classifiers. We used univariable and multivariable weighted logistic regression models to investigate the associations of potential risk factors with type 2 diabetes. RESULTS: All predictive models for type 2 diabetes achieved a high area under the curve (AUC), ranging from 0.7182 to 0.7949. Although the neural network model had the highest accuracy (82.4%), specificity (90.2%), and AUC (0.7949), the decision tree model had the highest sensitivity (51.6%) for type 2 diabetes. We found that people who slept 9 or more hours per day (adjusted odds ratio [aOR] = 1.13, 95% confidence interval [CI], 1.03–1.25) or had checkup frequency of less than 1 year (aOR = 2.31, 95% CI, 1.86–2.85) had higher risk for type 2 diabetes. CONCLUSION: Of the 8 predictive models, the neural network model gave the best model performance with the highest AUC value; however, the decision tree model is preferred for initial screening for type 2 diabetes because it had the highest sensitivity and, therefore, detection rate. We confirmed previously reported risk factors and also identified sleeping time and frequency of checkup as 2 new potential risk factors related to type 2 diabetes.
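The abstract above compares classifiers by accuracy, sensitivity, specificity, and AUC. As a rough illustration of how those four measures relate, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for this example and have no connection to the BRFSS data.

```python
from typing import Dict, Sequence


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels (1 = has condition)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg), which is
    fine for a toy example but slow for large samples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): 3 positives and 5 negatives.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5; lowering it trades specificity for sensitivity.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print(screening_metrics(toy_labels, toy_preds))
print(auc(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. That is why one model can have the highest AUC while another has the highest sensitivity, as the abstract reports.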
format Online
Article
Text
id pubmed-6795062
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Centers for Disease Control and Prevention
record_format MEDLINE/PubMed
spelling pubmed-67950622019-10-25 Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques Xie, Zidian Nikolayeva, Olga Luo, Jiebo Li, Dongmei Prev Chronic Dis Original Research INTRODUCTION: As one of the most prevalent chronic diseases in the United States, diabetes, especially type 2 diabetes, affects the health of millions of people and puts an enormous financial burden on the US economy. We aimed to develop predictive models to identify risk factors for type 2 diabetes, which could help facilitate early diagnosis and intervention and also reduce medical costs. METHODS: We analyzed cross-sectional data on 138,146 participants, including 20,467 with type 2 diabetes, from the 2014 Behavioral Risk Factor Surveillance System. We built several machine learning models for predicting type 2 diabetes, including support vector machine, decision tree, logistic regression, random forest, neural network, and Gaussian Naive Bayes classifiers. We used univariable and multivariable weighted logistic regression models to investigate the associations of potential risk factors with type 2 diabetes. RESULTS: All predictive models for type 2 diabetes achieved a high area under the curve (AUC), ranging from 0.7182 to 0.7949. Although the neural network model had the highest accuracy (82.4%), specificity (90.2%), and AUC (0.7949), the decision tree model had the highest sensitivity (51.6%) for type 2 diabetes. We found that people who slept 9 or more hours per day (adjusted odds ratio [aOR] = 1.13, 95% confidence interval [CI], 1.03–1.25) or had checkup frequency of less than 1 year (aOR = 2.31, 95% CI, 1.86–2.85) had higher risk for type 2 diabetes. CONCLUSION: Of the 8 predictive models, the neural network model gave the best model performance with the highest AUC value; however, the decision tree model is preferred for initial screening for type 2 diabetes because it had the highest sensitivity and, therefore, detection rate. 
We confirmed previously reported risk factors and also identified sleeping time and frequency of checkup as 2 new potential risk factors related to type 2 diabetes. Centers for Disease Control and Prevention 2019-09-19 /pmc/articles/PMC6795062/ /pubmed/31538566 http://dx.doi.org/10.5888/pcd16.190109 Text en https://creativecommons.org/licenses/by/4.0/ This is a publication of the U.S. Government. This publication is in the public domain and is therefore without copyright. All text from this work may be reprinted freely. Use of these materials should be properly cited.
spellingShingle Original Research
Xie, Zidian
Nikolayeva, Olga
Luo, Jiebo
Li, Dongmei
Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques
title Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques
title_full Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques
title_fullStr Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques
title_full_unstemmed Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques
title_short Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques
title_sort building risk prediction models for type 2 diabetes using machine learning techniques
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6795062/
https://www.ncbi.nlm.nih.gov/pubmed/31538566
http://dx.doi.org/10.5888/pcd16.190109
work_keys_str_mv AT xiezidian buildingriskpredictionmodelsfortype2diabetesusingmachinelearningtechniques
AT nikolayevaolga buildingriskpredictionmodelsfortype2diabetesusingmachinelearningtechniques
AT luojiebo buildingriskpredictionmodelsfortype2diabetesusingmachinelearningtechniques
AT lidongmei buildingriskpredictionmodelsfortype2diabetesusingmachinelearningtechniques