Development and validation of questionnaire-based machine learning models for predicting all-cause mortality in a representative population of China

BACKGROUND: Considering that the previously developed mortality prediction models have limited applications to the Chinese population, a questionnaire-based prediction model is of great importance for its accuracy and convenience in clinical practice. METHODS: Two national cohorts, namely, the China...

Bibliographic Details
Main authors: Li, Ziyi; Yang, Na; He, Liyun; Wang, Jialu; Ping, Fan; Li, Wei; Xu, Lingling; Zhang, Huabing; Li, Yuxiu
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Public Health
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9911458/
https://www.ncbi.nlm.nih.gov/pubmed/36778549
http://dx.doi.org/10.3389/fpubh.2023.1033070
author Li, Ziyi
Yang, Na
He, Liyun
Wang, Jialu
Ping, Fan
Li, Wei
Xu, Lingling
Zhang, Huabing
Li, Yuxiu
author_sort Li, Ziyi
collection PubMed
description BACKGROUND: Considering that the previously developed mortality prediction models have limited applications to the Chinese population, a questionnaire-based prediction model is of great importance for its accuracy and convenience in clinical practice. METHODS: Two national cohorts, namely, the China Health and Nutrition Survey (8,355 individuals older than 18) and the China Health and Retirement Longitudinal Study (12,711 individuals older than 45), were used for model development and validation. One hundred and fifty-nine variables were compiled to generate predictions. The Cox regression model and six machine learning (ML) models were used to predict all-cause mortality. Finally, a simple questionnaire-based ML prediction model was developed using the best algorithm and validated. RESULTS: In the internal validation set, all the ML models performed better than the traditional Cox model in predicting 6-year mortality, and the random survival forest (RSF) model performed best. The questionnaire-based ML model, which included only 20 variables, achieved a C-index of 0.86 (95% CI: 0.80–0.92). On external validation, the simple questionnaire-based model achieved C-indexes of 0.82 (95% CI: 0.77–0.87), 0.77 (95% CI: 0.75–0.79), and 0.79 (95% CI: 0.77–0.81) in predicting 2-, 9-, and 11-year mortality, respectively. CONCLUSIONS: In this prospective population-based study, a model based on the RSF analysis performed best among all models. Furthermore, there was no significant difference between the prediction performance of the questionnaire-based ML model, which included only 20 variables, and that of the model with all variables (including laboratory variables). The simple questionnaire-based ML prediction model, which needs to be further explored, is of great importance for its accuracy and suitability to the Chinese general population.
format Online
Article
Text
id pubmed-9911458
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9911458 2023-02-11 Development and validation of questionnaire-based machine learning models for predicting all-cause mortality in a representative population of China Li, Ziyi Yang, Na He, Liyun Wang, Jialu Ping, Fan Li, Wei Xu, Lingling Zhang, Huabing Li, Yuxiu Front Public Health Public Health BACKGROUND: Considering that the previously developed mortality prediction models have limited applications to the Chinese population, a questionnaire-based prediction model is of great importance for its accuracy and convenience in clinical practice. METHODS: Two national cohorts, namely, the China Health and Nutrition Survey (8,355 individuals older than 18) and the China Health and Retirement Longitudinal Study (12,711 individuals older than 45), were used for model development and validation. One hundred and fifty-nine variables were compiled to generate predictions. The Cox regression model and six machine learning (ML) models were used to predict all-cause mortality. Finally, a simple questionnaire-based ML prediction model was developed using the best algorithm and validated. RESULTS: In the internal validation set, all the ML models performed better than the traditional Cox model in predicting 6-year mortality, and the random survival forest (RSF) model performed best. The questionnaire-based ML model, which included only 20 variables, achieved a C-index of 0.86 (95% CI: 0.80–0.92). On external validation, the simple questionnaire-based model achieved C-indexes of 0.82 (95% CI: 0.77–0.87), 0.77 (95% CI: 0.75–0.79), and 0.79 (95% CI: 0.77–0.81) in predicting 2-, 9-, and 11-year mortality, respectively. CONCLUSIONS: In this prospective population-based study, a model based on the RSF analysis performed best among all models. Furthermore, there was no significant difference between the prediction performance of the questionnaire-based ML model, which included only 20 variables, and that of the model with all variables (including laboratory variables).
The simple questionnaire-based ML prediction model, which needs to be further explored, is of great importance for its accuracy and suitability to the Chinese general population. Frontiers Media S.A. 2023-01-27 /pmc/articles/PMC9911458/ /pubmed/36778549 http://dx.doi.org/10.3389/fpubh.2023.1033070 Text en Copyright © 2023 Li, Yang, He, Wang, Ping, Li, Xu, Zhang and Li. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Development and validation of questionnaire-based machine learning models for predicting all-cause mortality in a representative population of China
topic Public Health
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9911458/
https://www.ncbi.nlm.nih.gov/pubmed/36778549
http://dx.doi.org/10.3389/fpubh.2023.1033070