Machine learning models to predict in-hospital mortality in septic patients with diabetes
BACKGROUND: Sepsis is a leading cause of morbidity and mortality in hospitalized patients. Up to now, there are no well-established longitudinal networks from molecular mechanisms to clinical phenotypes in sepsis. Adding to the problem, about one of the five patients presented with diabetes. For thi...
Main authors: | Qi, Jing; Lei, Jingchao; Li, Nanyi; Huang, Dan; Liu, Huaizheng; Zhou, Kefu; Dai, Zheren; Sun, Chuanzheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Endocrinology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709414/ https://www.ncbi.nlm.nih.gov/pubmed/36465642 http://dx.doi.org/10.3389/fendo.2022.1034251 |
_version_ | 1784841149359325184 |
---|---|
author | Qi, Jing Lei, Jingchao Li, Nanyi Huang, Dan Liu, Huaizheng Zhou, Kefu Dai, Zheren Sun, Chuanzheng |
author_sort | Qi, Jing |
collection | PubMed |
description | BACKGROUND: Sepsis is a leading cause of morbidity and mortality in hospitalized patients. To date, there are no well-established longitudinal networks linking molecular mechanisms to clinical phenotypes in sepsis. Adding to the problem, about one in five patients presents with diabetes. For this subgroup, management is difficult and prognosis is hard to evaluate. METHODS: From three databases, a total of 7,001 patients were enrolled on the basis of the Sepsis-3 criteria and a diabetes diagnosis. Input variables were selected manually on the basis of correlation analysis, leaving 53 variables. A total of 5,727 records were collected from the Medical Information Mart for Intensive Care database and randomly split into a training set and an internal validation set at a ratio of 7:3. Logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost models were then built on the training set and tested on the internal validation set. Data from the eICU Collaborative Research Database (n = 815) and the dtChina critical care database (n = 459) were used as external validation sets to test model performance. RESULTS: In the internal validation set, the accuracy values of logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost were 0.878, 0.883, 0.865, 0.883, and 0.882, respectively. In external validation set 1, the corresponding values were 0.879, 0.877, 0.865, 0.886, and 0.875. In external validation set 2, they were 0.715, 0.745, 0.763, 0.760, and 0.699.
CONCLUSION: The top three models for the internal validation set were Bayes logistic regression, random forest, and XGBoost, whereas the top three models for external validation set 1 were random forest, logistic regression with lasso regularization, and Bayes logistic regression. The top three models for external validation set 2 were decision tree, random forest, and Bayes logistic regression. The random forest model performed well on the training set and all three validation sets. The most important features were age, albumin, and lactate. |
format | Online Article Text |
id | pubmed-9709414 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-97094142022-12-01 Machine learning models to predict in-hospital mortality in septic patients with diabetes Qi, Jing Lei, Jingchao Li, Nanyi Huang, Dan Liu, Huaizheng Zhou, Kefu Dai, Zheren Sun, Chuanzheng Front Endocrinol (Lausanne) Endocrinology BACKGROUND: Sepsis is a leading cause of morbidity and mortality in hospitalized patients. Up to now, there are no well-established longitudinal networks from molecular mechanisms to clinical phenotypes in sepsis. Adding to the problem, about one of the five patients presented with diabetes. For this subgroup, management is difficult, and prognosis is difficult to evaluate. METHODS: From the three databases, a total of 7,001 patients were enrolled on the basis of sepsis-3 standard and diabetes diagnosis. Input variable selection is based on the result of correlation analysis in a handpicking way, and 53 variables were left. A total of 5,727 records were collected from Medical Information Mart for Intensive Care database and randomly split into a training set and an internal validation set at a ratio of 7:3. Then, logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost were conducted to build the predictive model by using training set. Then, the models were tested by the internal validation set. The data from eICU Collaborative Research Database (n = 815) and dtChina critical care database (n = 459) were used to test the model performance as the external validation set. RESULTS: In the internal validation set, the accuracy values of logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost were 0.878, 0.883, 0.865, 0.883, and 0.882, respectively. Likewise, in the external validation set 1, lasso regularization = 0.879, Bayes logistic regression = 0.877, decision tree = 0.865, random forest = 0.886, and XGBoost = 0.875. 
In the external validation set 2, lasso regularization = 0.715, Bayes logistic regression = 0.745, decision tree = 0.763, random forest = 0.760, and XGBoost = 0.699. CONCLUSION: The top three models for internal validation set were Bayes logistic regression, random forest, and XGBoost, whereas the top three models for external validation set 1 were random forest, logistic regression, and Bayes logistic regression. In addition, the top three models for the external validation set 2 were decision tree, random forest, and Bayes logistic regression. Random forest model performed well with the training and three validation sets. The most important features are age, albumin, and lactate. Frontiers Media S.A. 2022-11-16 /pmc/articles/PMC9709414/ /pubmed/36465642 http://dx.doi.org/10.3389/fendo.2022.1034251 Text en Copyright © 2022 Qi, Lei, Li, Huang, Liu, Zhou, Dai and Sun https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Endocrinology Qi, Jing Lei, Jingchao Li, Nanyi Huang, Dan Liu, Huaizheng Zhou, Kefu Dai, Zheren Sun, Chuanzheng Machine learning models to predict in-hospital mortality in septic patients with diabetes |
title | Machine learning models to predict in-hospital mortality in septic patients with diabetes |
topic | Endocrinology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709414/ https://www.ncbi.nlm.nih.gov/pubmed/36465642 http://dx.doi.org/10.3389/fendo.2022.1034251 |
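
The figures quoted in the abstract can be cross-checked with a few lines of Python. The sketch below is not the authors' code: it only re-uses the cohort sizes and accuracy values printed in the record, and the names `COHORTS`, `ACCURACY`, `top_models`, and `split_sizes` are hypothetical helpers introduced for illustration. The abstract does not state the exact training and validation set sizes, so the 7:3 split sizes computed here are an approximation obtained by rounding.

```python
# Cohort sizes as stated in the METHODS paragraph of the abstract.
COHORTS = {"MIMIC": 5727, "eICU": 815, "dtChina": 459}

# Accuracy values copied from the RESULTS paragraph of the abstract.
ACCURACY = {
    "internal validation": {
        "lasso logistic regression": 0.878,
        "Bayes logistic regression": 0.883,
        "decision tree": 0.865,
        "random forest": 0.883,
        "XGBoost": 0.882,
    },
    "external validation 1": {
        "lasso logistic regression": 0.879,
        "Bayes logistic regression": 0.877,
        "decision tree": 0.865,
        "random forest": 0.886,
        "XGBoost": 0.875,
    },
    "external validation 2": {
        "lasso logistic regression": 0.715,
        "Bayes logistic regression": 0.745,
        "decision tree": 0.763,
        "random forest": 0.760,
        "XGBoost": 0.699,
    },
}


def top_models(scores, k=3):
    """Return the k model names with the highest accuracy.

    sorted() is stable, so tied models keep their order of entry.
    """
    return sorted(scores, key=scores.get, reverse=True)[:k]


def split_sizes(n, train_fraction=0.7):
    """Approximate (train, validation) sizes for a random split of n records."""
    n_train = round(n * train_fraction)
    return n_train, n - n_train


if __name__ == "__main__":
    print("total enrolled:", sum(COHORTS.values()))
    print("approximate 7:3 split of MIMIC records:", split_sizes(COHORTS["MIMIC"]))
    for dataset, scores in ACCURACY.items():
        print(dataset, "->", top_models(scores))
```

Run as a script, this reproduces the total of 7,001 enrolled patients and the three top-three rankings listed in the CONCLUSION paragraph. Accuracy alone says little when in-hospital deaths are a minority class, so a fuller comparison would also look at discrimination and calibration metrics, which this record does not report.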