Disability risk prediction model based on machine learning among Chinese healthy older adults: results from the China Health and Retirement Longitudinal Study
BACKGROUND: Predicting disability risk in healthy older adults in China is essential for timely preventive interventions, improving their quality of life, and providing scientific evidence for disability prevention. Therefore, developing a machine learning model capable of evaluating disability risk...
Main authors: | Han, Yuchen; Wang, Shaobing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Public Health |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10665855/ https://www.ncbi.nlm.nih.gov/pubmed/38026309 http://dx.doi.org/10.3389/fpubh.2023.1271595 |
_version_ | 1785138918581075968 |
---|---|
author | Han, Yuchen; Wang, Shaobing |
author_facet | Han, Yuchen; Wang, Shaobing |
author_sort | Han, Yuchen |
collection | PubMed |
description | BACKGROUND: Predicting disability risk in healthy older adults in China is essential for timely preventive interventions, improving their quality of life, and providing scientific evidence for disability prevention. Therefore, developing a machine learning model capable of evaluating disability risk based on longitudinal research data is crucial. METHODS: We conducted a prospective cohort study of 2,175 older adults enrolled in the China Health and Retirement Longitudinal Study (CHARLS) between 2015 and 2018 to develop and validate this prediction model. Several machine learning algorithms (logistic regression, k-nearest neighbors, naive Bayes, multilayer perceptron, random forest, and XGBoost) were used to assess the 3-year risk of developing disability. The optimal cutoff points and adjustment parameters were explored in the training set, the prediction accuracy of the models was compared in the testing set, and the best-performing models were further interpreted. RESULTS: During a 3-year follow-up period, a total of 505 (23.22%) healthy older adult individuals developed disabilities. Among the 43 features examined, the LASSO regression identified 11 features as significant for model establishment. When comparing six different machine learning models on the testing set, the XGBoost model demonstrated the best performance across various evaluation metrics, including the highest area under the ROC curve (0.803), accuracy (0.757), sensitivity (0.790), and F1 score (0.789), while its specificity was 0.712. The decision curve analysis (DCA) showed that XGBoost had the highest net benefit in most of the threshold ranges. Based on the importance of features determined by SHAP (model interpretation method), the top five important features were identified as right-hand grip strength, depressive symptoms, marital status, respiratory function, and age.
Moreover, the SHAP summary plot was used to illustrate the positive or negative effects attributed to the features influenced by XGBoost. The SHAP dependence plot explained how individual features affected the output of the predictive model. CONCLUSION: Machine learning-based prediction models can accurately evaluate the likelihood of disability in healthy older adults over a period of 3 years. A combination of XGBoost and SHAP can provide clear explanations for personalized risk prediction and offer a more intuitive understanding of the effect of key features in the model. |
format | Online Article Text |
id | pubmed-10665855 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10665855 2023-11-09 Disability risk prediction model based on machine learning among Chinese healthy older adults: results from the China Health and Retirement Longitudinal Study Han, Yuchen; Wang, Shaobing Front Public Health Public Health BACKGROUND: Predicting disability risk in healthy older adults in China is essential for timely preventive interventions, improving their quality of life, and providing scientific evidence for disability prevention. Therefore, developing a machine learning model capable of evaluating disability risk based on longitudinal research data is crucial. METHODS: We conducted a prospective cohort study of 2,175 older adults enrolled in the China Health and Retirement Longitudinal Study (CHARLS) between 2015 and 2018 to develop and validate this prediction model. Several machine learning algorithms (logistic regression, k-nearest neighbors, naive Bayes, multilayer perceptron, random forest, and XGBoost) were used to assess the 3-year risk of developing disability. The optimal cutoff points and adjustment parameters were explored in the training set, the prediction accuracy of the models was compared in the testing set, and the best-performing models were further interpreted. RESULTS: During a 3-year follow-up period, a total of 505 (23.22%) healthy older adult individuals developed disabilities. Among the 43 features examined, the LASSO regression identified 11 features as significant for model establishment. When comparing six different machine learning models on the testing set, the XGBoost model demonstrated the best performance across various evaluation metrics, including the highest area under the ROC curve (0.803), accuracy (0.757), sensitivity (0.790), and F1 score (0.789), while its specificity was 0.712. The decision curve analysis (DCA) showed that XGBoost had the highest net benefit in most of the threshold ranges.
Based on the importance of features determined by SHAP (model interpretation method), the top five important features were identified as right-hand grip strength, depressive symptoms, marital status, respiratory function, and age. Moreover, the SHAP summary plot was used to illustrate the positive or negative effects attributed to the features influenced by XGBoost. The SHAP dependence plot explained how individual features affected the output of the predictive model. CONCLUSION: Machine learning-based prediction models can accurately evaluate the likelihood of disability in healthy older adults over a period of 3 years. A combination of XGBoost and SHAP can provide clear explanations for personalized risk prediction and offer a more intuitive understanding of the effect of key features in the model. Frontiers Media S.A. 2023-11-09 /pmc/articles/PMC10665855/ /pubmed/38026309 http://dx.doi.org/10.3389/fpubh.2023.1271595 Text en Copyright © 2023 Han and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Public Health Han, Yuchen Wang, Shaobing Disability risk prediction model based on machine learning among Chinese healthy older adults: results from the China Health and Retirement Longitudinal Study |
title | Disability risk prediction model based on machine learning among Chinese healthy older adults: results from the China Health and Retirement Longitudinal Study |
topic | Public Health |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10665855/ https://www.ncbi.nlm.nih.gov/pubmed/38026309 http://dx.doi.org/10.3389/fpubh.2023.1271595 |