Predicting Diabetes in Patients with Metabolic Syndrome Using Machine-Learning Model Based on Multiple Years’ Data
PURPOSE: To evaluate the performance of machine-learning models based on multiple years of continuous data to predict incident diabetes among patients with metabolic syndrome. PATIENTS AND METHODS: The dataset comprises the health records from 2008 to 2020 including 4510 nondiabetic participants wit...
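The abstract in this record names several multi-year summary features (highest fasting plasma glucose, mean or lowest HbA1c and BMI, and “delta weight”) without giving formulas. The Python sketch below shows one way such features could be derived from a patient's yearly records. The function name `summarize_years`, the dictionary keys, the sample values, and in particular the definition of `delta_weight` as the range of year-over-year weight changes are assumptions made for illustration, not the authors' method.

```python
from statistics import mean


def summarize_years(records):
    """Collapse yearly measurements into multi-year summary features.

    `records` is a list of dicts ordered by year. The keys 'fpg',
    'hba1c', 'bmi' and 'weight' are hypothetical names for this sketch.
    """
    fpg = [r["fpg"] for r in records]
    hba1c = [r["hba1c"] for r in records]
    bmi = [r["bmi"] for r in records]
    weight = [r["weight"] for r in records]

    # Year-over-year weight changes (empty if only one year is given).
    changes = [later - earlier for earlier, later in zip(weight, weight[1:])]

    return {
        "fpg_max": max(fpg),
        "hba1c_mean": mean(hba1c),
        "hba1c_min": min(hba1c),
        "bmi_mean": mean(bmi),
        "bmi_min": min(bmi),
        # ASSUMPTION: "delta weight" is taken here as the spread of the
        # yearly changes; the record does not define it precisely.
        "delta_weight": (max(changes) - min(changes)) if changes else 0.0,
    }


# Made-up values for one hypothetical patient over three years.
patient = [
    {"fpg": 5.6, "hba1c": 5.7, "bmi": 27.0, "weight": 70.0},
    {"fpg": 6.1, "hba1c": 5.9, "bmi": 28.0, "weight": 72.0},
    {"fpg": 5.9, "hba1c": 5.8, "bmi": 27.5, "weight": 71.0},
]
features = summarize_years(patient)
print(features)
```

Passing only the first two records would give a "year 1–2" style feature set, and all three a "year 1–3" style set, which mirrors how the abstract distinguishes its multiple-year models.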
Main Authors: | Li, Jing; Xu, Zheng; Xu, Tengda; Lin, Songbai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Dove 2022 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9525025/ https://www.ncbi.nlm.nih.gov/pubmed/36186938 http://dx.doi.org/10.2147/DMSO.S381146 |
_version_ | 1784800619136024576 |
---|---|
author | Li, Jing Xu, Zheng Xu, Tengda Lin, Songbai |
author_facet | Li, Jing Xu, Zheng Xu, Tengda Lin, Songbai |
author_sort | Li, Jing |
collection | PubMed |
description | PURPOSE: To evaluate the performance of machine-learning models based on multiple years of continuous data to predict incident diabetes among patients with metabolic syndrome. PATIENTS AND METHODS: The dataset comprises health records from 2008 to 2020 for 4510 nondiabetic participants who had metabolic syndrome (MetS) at baseline and at least 6 years of records. MetS was defined according to the International Diabetes Federation (IDF) criteria. Overall, 332 patients developed incident diabetes during the 7±1.4 years of follow-up. Three popular classification algorithms were evaluated on the dataset: logistic regression, random forest, and XGBoost. For each algorithm, five models were developed: three single-year models (year 1, year 2, and year 3) and two multiple-year models (year 1–2 and year 1–3). RESULTS: Model performance improved as more longitudinal data were included: the area under the receiver operating characteristic curve (AUROC) increased for both the random forest (year 1–3: AUROC=0.893; year 3: AUROC=0.862; year 1–2: AUROC=0.847; year 2: AUROC=0.838) and XGBoost (year 1–3: AUROC=0.897; year 3: AUROC=0.833; year 1–2: AUROC=0.856; year 2: AUROC=0.823) models. In the multiple-year models, the highest fasting plasma glucose, followed by the mean or lowest level of HbA1c and BMI, had the greatest predictive value for the onset of diabetes. In the year 1–3 model, “delta weight”, which reflects the fluctuations in the yearly change of weight, was the fourth most important feature. CONCLUSION: This study demonstrated improved performance with the accumulation of longitudinal data when using machine learning for diabetes prediction in MetS patients. For individuals with similar clinical parameters, the trends in these parameters could change the risk of future diabetes. This result indicates that models based on multiple years of longitudinal data may provide more personalized tools for risk assessment. |
format | Online Article Text |
id | pubmed-9525025 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Dove |
record_format | MEDLINE/PubMed |
spelling | pubmed-9525025 2022-10-01 Predicting Diabetes in Patients with Metabolic Syndrome Using Machine-Learning Model Based on Multiple Years’ Data Li, Jing Xu, Zheng Xu, Tengda Lin, Songbai Diabetes Metab Syndr Obes Original Research PURPOSE: To evaluate the performance of machine-learning models based on multiple years of continuous data to predict incident diabetes among patients with metabolic syndrome. PATIENTS AND METHODS: The dataset comprises health records from 2008 to 2020 for 4510 nondiabetic participants who had metabolic syndrome (MetS) at baseline and at least 6 years of records. MetS was defined according to the International Diabetes Federation (IDF) criteria. Overall, 332 patients developed incident diabetes during the 7±1.4 years of follow-up. Three popular classification algorithms were evaluated on the dataset: logistic regression, random forest, and XGBoost. For each algorithm, five models were developed: three single-year models (year 1, year 2, and year 3) and two multiple-year models (year 1–2 and year 1–3). RESULTS: Model performance improved as more longitudinal data were included: the area under the receiver operating characteristic curve (AUROC) increased for both the random forest (year 1–3: AUROC=0.893; year 3: AUROC=0.862; year 1–2: AUROC=0.847; year 2: AUROC=0.838) and XGBoost (year 1–3: AUROC=0.897; year 3: AUROC=0.833; year 1–2: AUROC=0.856; year 2: AUROC=0.823) models. In the multiple-year models, the highest fasting plasma glucose, followed by the mean or lowest level of HbA1c and BMI, had the greatest predictive value for the onset of diabetes. In the year 1–3 model, “delta weight”, which reflects the fluctuations in the yearly change of weight, was the fourth most important feature. CONCLUSION: This study demonstrated improved performance with the accumulation of longitudinal data when using machine learning for diabetes prediction in MetS patients. For individuals with similar clinical parameters, the trends in these parameters could change the risk of future diabetes. This result indicates that models based on multiple years of longitudinal data may provide more personalized tools for risk assessment. Dove 2022-09-26 /pmc/articles/PMC9525025/ /pubmed/36186938 http://dx.doi.org/10.2147/DMSO.S381146 Text en © 2022 Li et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (http://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php). |
spellingShingle | Original Research Li, Jing Xu, Zheng Xu, Tengda Lin, Songbai Predicting Diabetes in Patients with Metabolic Syndrome Using Machine-Learning Model Based on Multiple Years’ Data |
title | Predicting Diabetes in Patients with Metabolic Syndrome Using Machine-Learning Model Based on Multiple Years’ Data |
title_sort | predicting diabetes in patients with metabolic syndrome using machine-learning model based on multiple years’ data |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9525025/ https://www.ncbi.nlm.nih.gov/pubmed/36186938 http://dx.doi.org/10.2147/DMSO.S381146 |
work_keys_str_mv | AT lijing predictingdiabetesinpatientswithmetabolicsyndromeusingmachinelearningmodelbasedonmultipleyearsdata AT xuzheng predictingdiabetesinpatientswithmetabolicsyndromeusingmachinelearningmodelbasedonmultipleyearsdata AT xutengda predictingdiabetesinpatientswithmetabolicsyndromeusingmachinelearningmodelbasedonmultipleyearsdata AT linsongbai predictingdiabetesinpatientswithmetabolicsyndromeusingmachinelearningmodelbasedonmultipleyearsdata |
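
The record reports model quality as AUROC. As a reminder of what that number measures, here is a minimal pure-Python sketch that computes AUROC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not data from the study, and the study's own evaluation code is not described in this record.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney style) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = developed diabetes, in this sketch).
    scores: iterable of predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example with made-up values: 3 of the 4 positive/negative pairs
# are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The double loop is quadratic in the number of cases, which is fine for a small illustration; for a cohort of several thousand participants a rank-based implementation from an established library would be the more practical choice.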