Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA)

OBJECTIVES: The prevalence of diabetes has increased globally, leading to a significant disease burden and financial cost. Early prediction is crucial to control its prevalence. DESIGN: A prospective cohort study. SETTING: National representative study on Irish. PARTICIPANTS: 8504 individuals aged 5...

Bibliographic Details
Main Authors: Xu, Xuezhong; Mingyang, Xue; Yang, Jie; Zheng, Hailong; Che, Zhifei
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2023
Subjects: Diabetes and Endocrinology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255035/
https://www.ncbi.nlm.nih.gov/pubmed/37253496
http://dx.doi.org/10.1136/bmjopen-2023-072991
_version_ 1785056776708685824
author Xu, Xuezhong
Mingyang, Xue
Yang, Jie
Zheng, Hailong
Che, Zhifei
author_sort Xu, Xuezhong
collection PubMed
description OBJECTIVES: The prevalence of diabetes has increased globally, leading to a significant disease burden and financial cost. Early prediction is crucial to control its prevalence. DESIGN: A prospective cohort study. SETTING: National representative study on Irish. PARTICIPANTS: 8504 individuals aged 50 years or older were included. PRIMARY AND SECONDARY OUTCOME MEASURES: Surveys were conducted to collect over 40 000 variables related to social, financial, health, mental and family status. Feature selection was performed using logistic regression. Different machine/deep learning algorithms were trained, including distributed random forest, extremely randomised trees, a generalised linear model with regularisation, a gradient boosting machine and a deep neural network. These algorithms were integrated into a stacked ensemble to generate the best model. The model was tested using various metrics, such as the area under the curve (AUC), log loss, mean per classification error, mean square error (MSE) and root MSE (RMSE). The SHapley Additive exPlanations (SHAP) method was used to interpret the established model. RESULTS: After 2 years, 105 baseline features were identified as major contributors to diabetes risk, including sex, low-density lipoprotein cholesterol and cirrhosis. The best model achieved high accuracy, robustness and discrimination in predicting diabetes risk, with an AUC of 0.854, log loss of 0.187, mean per classification error of 0.267, RMSE of 0.229 and MSE of 0.052 in the independent test set. The model was also shown to be well calibrated. The SHAP algorithm provided insights into the decision-making process of the model. CONCLUSIONS: These findings could help physicians in the early identification of high-risk patients and implement targeted interventions to reduce diabetes incidence.
format Online
Article
Text
id pubmed-10255035
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-10255035 2023-06-10 BMJ Open Diabetes and Endocrinology BMJ Publishing Group 2023-05-30 /pmc/articles/PMC10255035/ /pubmed/37253496 http://dx.doi.org/10.1136/bmjopen-2023-072991 Text en © Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
title Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA)
topic Diabetes and Endocrinology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255035/
https://www.ncbi.nlm.nih.gov/pubmed/37253496
http://dx.doi.org/10.1136/bmjopen-2023-072991
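The description above lists the metrics used to test the model: AUC, log loss, mean per-class error, MSE and RMSE. As a minimal sketch of how such metrics are commonly computed for a binary classifier, the plain-Python functions below may help. The function names, the toy labels and probabilities, and the 0.5 classification threshold are all illustrative assumptions; the record does not state how the authors computed their values.

```python
import math


def auc(y_true, y_prob):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels; probabilities are clipped."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def mse(y_true, y_prob):
    """Mean squared difference between label and predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def rmse(y_true, y_prob):
    """Square root of the MSE."""
    return math.sqrt(mse(y_true, y_prob))


def mean_per_class_error(y_true, y_prob, threshold=0.5):
    """Average of the error rates within each class (threshold is an assumption)."""
    rates = []
    for cls in (0, 1):
        preds = [int(p >= threshold) for y, p in zip(y_true, y_prob) if y == cls]
        rates.append(sum(1 for pred in preds if pred != cls) / len(preds))
    return sum(rates) / len(rates)


# Hypothetical toy data: 1 = developed diabetes, 0 = did not.
y_true = [0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]

print("AUC:", auc(y_true, y_prob))
print("log loss:", log_loss(y_true, y_prob))
print("mean per-class error:", mean_per_class_error(y_true, y_prob))
print("MSE:", mse(y_true, y_prob), "RMSE:", rmse(y_true, y_prob))
```

Because RMSE is the square root of MSE, the two reported test-set values can be checked against each other: 0.229 squared is roughly 0.052, which agrees with the reported MSE.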