Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA)
OBJECTIVES: The prevalence of diabetes has increased globally, leading to a significant disease burden and financial cost. Early prediction is crucial to control its prevalence. DESIGN: A prospective cohort study. SETTING: National representative study on Irish. PARTICIPANTS: 8504 individuals aged 5...
Main authors: | Xu, Xuezhong; Mingyang, Xue; Yang, Jie; Zheng, Hailong; Che, Zhifei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2023 |
Subjects: | Diabetes and Endocrinology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255035/ https://www.ncbi.nlm.nih.gov/pubmed/37253496 http://dx.doi.org/10.1136/bmjopen-2023-072991 |
_version_ | 1785056776708685824 |
---|---|
author | Xu, Xuezhong Mingyang, Xue Yang, Jie Zheng, Hailong Che, Zhifei |
author_facet | Xu, Xuezhong Mingyang, Xue Yang, Jie Zheng, Hailong Che, Zhifei |
author_sort | Xu, Xuezhong |
collection | PubMed |
description | OBJECTIVES: The prevalence of diabetes has increased globally, leading to a significant disease burden and financial cost. Early prediction is crucial to control its prevalence. DESIGN: A prospective cohort study. SETTING: National representative study on Irish. PARTICIPANTS: 8504 individuals aged 50 years or older were included. PRIMARY AND SECONDARY OUTCOME MEASURES: Surveys were conducted to collect over 40 000 variables related to social, financial, health, mental and family status. Feature selection was performed using logistic regression. Different machine/deep learning algorithms were trained, including distributed random forest, extremely randomised trees, a generalised linear model with regularisation, a gradient boosting machine and a deep neural network. These algorithms were integrated into a stacked ensemble to generate the best model. The model was tested using various metrics, such as the area under the curve (AUC), log loss, mean per classification error, mean square error (MSE) and root MSE (RMSE). The SHapley Additive exPlanations (SHAP) method was used to interpret the established model. RESULTS: After 2 years, 105 baseline features were identified as major contributors to diabetes risk, including sex, low-density lipoprotein cholesterol and cirrhosis. The best model achieved high accuracy, robustness and discrimination in predicting diabetes risk, with an AUC of 0.854, log loss of 0.187, mean per classification error of 0.267, RMSE of 0.229 and MSE of 0.052 in the independent test set. The model was also shown to be well calibrated. The SHAP algorithm provided insights into the decision-making process of the model. CONCLUSIONS: These findings could help physicians in the early identification of high-risk patients and implement targeted interventions to reduce diabetes incidence. |
format | Online Article Text |
id | pubmed-10255035 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-10255035 2023-06-10 Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) Xu, Xuezhong Mingyang, Xue Yang, Jie Zheng, Hailong Che, Zhifei BMJ Open Diabetes and Endocrinology OBJECTIVES: The prevalence of diabetes has increased globally, leading to a significant disease burden and financial cost. Early prediction is crucial to control its prevalence. DESIGN: A prospective cohort study. SETTING: National representative study on Irish. PARTICIPANTS: 8504 individuals aged 50 years or older were included. PRIMARY AND SECONDARY OUTCOME MEASURES: Surveys were conducted to collect over 40 000 variables related to social, financial, health, mental and family status. Feature selection was performed using logistic regression. Different machine/deep learning algorithms were trained, including distributed random forest, extremely randomised trees, a generalised linear model with regularisation, a gradient boosting machine and a deep neural network. These algorithms were integrated into a stacked ensemble to generate the best model. The model was tested using various metrics, such as the area under the curve (AUC), log loss, mean per classification error, mean square error (MSE) and root MSE (RMSE). The SHapley Additive exPlanations (SHAP) method was used to interpret the established model. RESULTS: After 2 years, 105 baseline features were identified as major contributors to diabetes risk, including sex, low-density lipoprotein cholesterol and cirrhosis. The best model achieved high accuracy, robustness and discrimination in predicting diabetes risk, with an AUC of 0.854, log loss of 0.187, mean per classification error of 0.267, RMSE of 0.229 and MSE of 0.052 in the independent test set. The model was also shown to be well calibrated. The SHAP algorithm provided insights into the decision-making process of the model. CONCLUSIONS: These findings could help physicians in the early identification of high-risk patients and implement targeted interventions to reduce diabetes incidence. BMJ Publishing Group 2023-05-30 /pmc/articles/PMC10255035/ /pubmed/37253496 http://dx.doi.org/10.1136/bmjopen-2023-072991 Text en © Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ (https://creativecommons.org/licenses/by-nc/4.0/). |
spellingShingle | Diabetes and Endocrinology Xu, Xuezhong Mingyang, Xue Yang, Jie Zheng, Hailong Che, Zhifei Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) |
title | Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) |
title_full | Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) |
title_fullStr | Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) |
title_full_unstemmed | Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) |
title_short | Tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the Irish Longitudinal Study on Ageing (TILDA) |
title_sort | tailored machine learning for evaluating the long-term diabetes risk in older individuals: findings from the irish longitudinal study on ageing (tilda) |
topic | Diabetes and Endocrinology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255035/ https://www.ncbi.nlm.nih.gov/pubmed/37253496 http://dx.doi.org/10.1136/bmjopen-2023-072991 |
work_keys_str_mv | AT xuxuezhong tailoredmachinelearningforevaluatingthelongtermdiabetesriskinolderindividualsfindingsfromtheirishlongitudinalstudyonageingtilda AT mingyangxue tailoredmachinelearningforevaluatingthelongtermdiabetesriskinolderindividualsfindingsfromtheirishlongitudinalstudyonageingtilda AT yangjie tailoredmachinelearningforevaluatingthelongtermdiabetesriskinolderindividualsfindingsfromtheirishlongitudinalstudyonageingtilda AT zhenghailong tailoredmachinelearningforevaluatingthelongtermdiabetesriskinolderindividualsfindingsfromtheirishlongitudinalstudyonageingtilda AT chezhifei tailoredmachinelearningforevaluatingthelongtermdiabetesriskinolderindividualsfindingsfromtheirishlongitudinalstudyonageingtilda |
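
The abstract in this record reports five evaluation metrics for a binary diabetes-risk classifier (AUC, log loss, mean per classification error, MSE and RMSE). The sketch below shows one common way to compute each of them from true labels and predicted probabilities, using only the Python standard library. It is an illustration, not the authors' code: the function names, the toy data and the 0.5 decision threshold used for the per-class error are all assumptions, since the record does not say which software or threshold the study used.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true labels under the predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def mse(y_true, p_pred):
    """Mean squared difference between label (0/1) and predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def rmse(y_true, p_pred):
    """Square root of the MSE."""
    return math.sqrt(mse(y_true, p_pred))


def auc(y_true, p_pred):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def mean_per_class_error(y_true, p_pred, threshold=0.5):
    """Average of the error rates of the two classes at a fixed threshold (assumed 0.5 here)."""
    errors = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for y, p in zip(y_true, p_pred):
        counts[y] += 1
        predicted = 1 if p >= threshold else 0
        if predicted != y:
            errors[y] += 1
    return (errors[0] / counts[0] + errors[1] / counts[1]) / 2


# Hypothetical toy data: 1 = developed diabetes, 0 = did not.
labels = [0, 0, 1, 1, 0, 1]
probs = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

print("AUC:", auc(labels, probs))
print("log loss:", log_loss(labels, probs))
print("mean per-class error:", mean_per_class_error(labels, probs))
print("MSE:", mse(labels, probs))
print("RMSE:", rmse(labels, probs))
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: `math.sqrt(0.052)` is about 0.228, which agrees with the reported RMSE of 0.229 up to rounding of the MSE.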