Predicting Diabetes in Patients with Metabolic Syndrome Using Machine-Learning Model Based on Multiple Years’ Data

PURPOSE: To evaluate the performance of machine-learning models based on multiple years of continuous data to predict incident diabetes among patients with metabolic syndrome. PATIENTS AND METHODS: The dataset comprises the health records from 2008 to 2020 including 4510 nondiabetic participants wit...

Bibliographic Details
Main authors: Li, Jing, Xu, Zheng, Xu, Tengda, Lin, Songbai
Format: Online Article Text
Language: English
Published: Dove 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9525025/
https://www.ncbi.nlm.nih.gov/pubmed/36186938
http://dx.doi.org/10.2147/DMSO.S381146
_version_ 1784800619136024576
author Li, Jing
Xu, Zheng
Xu, Tengda
Lin, Songbai
collection PubMed
description PURPOSE: To evaluate the performance of machine-learning models based on multiple years of continuous data to predict incident diabetes among patients with metabolic syndrome. PATIENTS AND METHODS: The dataset comprises health records from 2008 to 2020 for 4510 nondiabetic participants who had metabolic syndrome (MetS) at baseline and at least 6 years of records. MetS was defined according to the International Diabetes Federation (IDF) criteria. Overall, 332 patients developed incident diabetes during the 7±1.4 years of follow-up. Three popular classification algorithms were evaluated on the dataset: logistic regression, random forest, and XGBoost. For each algorithm, five models were developed: three single-year models (year 1, year 2, and year 3) and two multiple-year models (year 1–2 and year 1–3). RESULTS: Model performance improved as more longitudinal data were included: the area under the receiver operating characteristic curve (AUROC) increased for both the random forest models (year 1–3: AUROC=0.893; year 3: AUROC=0.862; year 1–2: AUROC=0.847; year 2: AUROC=0.838) and the XGBoost models (year 1–3: AUROC=0.897; year 3: AUROC=0.833; year 1–2: AUROC=0.856; year 2: AUROC=0.823). In the multiple-year models, the highest fasting plasma glucose, followed by the mean or lowest level of HbA1c and BMI, had the greatest predictive value for the onset of diabetes. In the “1–3” year model, “delta weight”, which reflects the fluctuations in yearly weight change, was the fourth-most important feature. CONCLUSION: This study demonstrated improved performance with the accumulation of longitudinal data when using machine learning for diabetes prediction in MetS patients. For individuals with similar clinical parameters, the variation trends of these parameters could change the risk of future diabetes. This result indicates that models based on multiple years of longitudinal data may provide more personalized tools for risk assessment.
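The multiple-year models described in the abstract rely on summaries of repeated yearly measurements (highest, mean, or lowest value, plus a "delta weight" term). The sketch below shows one way such features could be derived from a patient's yearly records. It is an illustration only, not the authors' code: the function name `multi_year_features`, the dictionary keys, the sample values, and in particular the definition of "delta weight" as the mean absolute year-over-year weight change are all assumptions, since the abstract does not give the exact formula.

```python
from statistics import mean


def multi_year_features(yearly_records):
    """Collapse one patient's yearly records into summary features.

    yearly_records: list of dicts in chronological order, one per year,
    with hypothetical keys 'fpg', 'hba1c', 'bmi', and 'weight'.
    """
    features = {}
    for key in ("fpg", "hba1c", "bmi"):
        values = [record[key] for record in yearly_records]
        features[f"{key}_max"] = max(values)
        features[f"{key}_min"] = min(values)
        features[f"{key}_mean"] = mean(values)

    weights = [record["weight"] for record in yearly_records]
    # Assumption: "delta weight" is taken here as the mean absolute
    # year-over-year change; the source does not define it precisely.
    changes = [abs(later - earlier) for earlier, later in zip(weights, weights[1:])]
    features["delta_weight"] = mean(changes) if changes else 0.0
    return features


# Made-up values for a single patient over three years (illustrative only).
patient = [
    {"fpg": 5.6, "hba1c": 5.7, "bmi": 27.0, "weight": 80.0},
    {"fpg": 6.1, "hba1c": 5.9, "bmi": 27.5, "weight": 82.0},
    {"fpg": 5.9, "hba1c": 5.8, "bmi": 27.2, "weight": 81.0},
]
features = multi_year_features(patient)
```

A "year 1–2" feature set would pass only the first two records, and a single-year model would use one record's raw values directly. The resulting feature dictionaries could then be fed to any of the three classifiers named in the abstract.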
format Online
Article
Text
id pubmed-9525025
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Dove
record_format MEDLINE/PubMed
spelling pubmed-9525025 2022-10-01 Diabetes Metab Syndr Obes Original Research Dove 2022-09-26 /pmc/articles/PMC9525025/ /pubmed/36186938 http://dx.doi.org/10.2147/DMSO.S381146 Text en © 2022 Li et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (https://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php).
title Predicting Diabetes in Patients with Metabolic Syndrome Using Machine-Learning Model Based on Multiple Years’ Data
topic Original Research