Machine learning models to predict in-hospital mortality in septic patients with diabetes

BACKGROUND: Sepsis is a leading cause of morbidity and mortality in hospitalized patients. To date, there are no well-established longitudinal networks linking molecular mechanisms to clinical phenotypes in sepsis. Adding to the problem, about one in five patients presents with diabetes. For thi...

Bibliographic Details
Main authors: Qi, Jing; Lei, Jingchao; Li, Nanyi; Huang, Dan; Liu, Huaizheng; Zhou, Kefu; Dai, Zheren; Sun, Chuanzheng
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Endocrinology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709414/
https://www.ncbi.nlm.nih.gov/pubmed/36465642
http://dx.doi.org/10.3389/fendo.2022.1034251
_version_ 1784841149359325184
author Qi, Jing
Lei, Jingchao
Li, Nanyi
Huang, Dan
Liu, Huaizheng
Zhou, Kefu
Dai, Zheren
Sun, Chuanzheng
collection PubMed
description BACKGROUND: Sepsis is a leading cause of morbidity and mortality in hospitalized patients. To date, there are no well-established longitudinal networks linking molecular mechanisms to clinical phenotypes in sepsis. Adding to the problem, about one in five patients presents with diabetes. For this subgroup, management is difficult, and prognosis is hard to evaluate. METHODS: Across three databases, a total of 7,001 patients were enrolled on the basis of the Sepsis-3 criteria and a diabetes diagnosis. Input variables were hand-picked on the basis of correlation analysis, leaving 53 variables. A total of 5,727 records were collected from the Medical Information Mart for Intensive Care database and randomly split into a training set and an internal validation set at a ratio of 7:3. Logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost models were then built on the training set and tested on the internal validation set. Data from the eICU Collaborative Research Database (n = 815) and the dtChina critical care database (n = 459) were used as external validation sets to test model performance. RESULTS: In the internal validation set, the accuracy values of logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost were 0.878, 0.883, 0.865, 0.883, and 0.882, respectively. In external validation set 1, the corresponding values were 0.879, 0.877, 0.865, 0.886, and 0.875. In external validation set 2, they were 0.715, 0.745, 0.763, 0.760, and 0.699. CONCLUSION: The top three models for the internal validation set were Bayes logistic regression, random forest, and XGBoost, whereas the top three models for external validation set 1 were random forest, logistic regression with lasso regularization, and Bayes logistic regression. The top three models for external validation set 2 were decision tree, random forest, and Bayes logistic regression. The random forest model performed well on the training set and all three validation sets. The most important features were age, albumin, and lactate.
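The rankings stated in the conclusion can be checked directly against the accuracy values reported in the results. The following is a minimal sketch for that purpose only, not the authors' code: the names `REPORTED_ACCURACY`, `COHORT_SIZES`, and `top_models` are hypothetical, and the numbers are copied from the abstract above. Ties (for example, 0.883 for both Bayes logistic regression and random forest in the internal validation set) are resolved by listing order, which is an assumption of this sketch.

```python
# Accuracy values copied from the abstract; all names here are illustrative.
REPORTED_ACCURACY = {
    "internal validation": {
        "logistic regression (lasso)": 0.878,
        "Bayes logistic regression": 0.883,
        "decision tree": 0.865,
        "random forest": 0.883,
        "XGBoost": 0.882,
    },
    "external validation 1": {
        "logistic regression (lasso)": 0.879,
        "Bayes logistic regression": 0.877,
        "decision tree": 0.865,
        "random forest": 0.886,
        "XGBoost": 0.875,
    },
    "external validation 2": {
        "logistic regression (lasso)": 0.715,
        "Bayes logistic regression": 0.745,
        "decision tree": 0.763,
        "random forest": 0.760,
        "XGBoost": 0.699,
    },
}

# Record counts per source database, as stated in the methods.
COHORT_SIZES = {
    "Medical Information Mart for Intensive Care": 5727,
    "eICU Collaborative Research Database": 815,
    "dtChina critical care database": 459,
}


def top_models(scores, k=3):
    """Return the k model names with the highest accuracy.

    sorted() is stable, so models with equal accuracy keep the order
    in which they appear in the input mapping.
    """
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # The three cohorts should add up to the 7,001 enrolled patients.
    print("total patients:", sum(COHORT_SIZES.values()))
    for dataset, scores in REPORTED_ACCURACY.items():
        print(dataset, "->", ", ".join(top_models(scores)))
```

Run as a script, this prints the cohort total and the three highest-accuracy models per validation set, which can be compared with the orderings given in the conclusion. Note that accuracy alone can be misleading when mortality is a minority outcome; the abstract reports no other metric, so none is used here.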
format Online
Article
Text
id pubmed-9709414
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9709414 2022-12-01 Machine learning models to predict in-hospital mortality in septic patients with diabetes Qi, Jing Lei, Jingchao Li, Nanyi Huang, Dan Liu, Huaizheng Zhou, Kefu Dai, Zheren Sun, Chuanzheng Front Endocrinol (Lausanne) Endocrinology Frontiers Media S.A. 2022-11-16 /pmc/articles/PMC9709414/ /pubmed/36465642 http://dx.doi.org/10.3389/fendo.2022.1034251 Text en Copyright © 2022 Qi, Lei, Li, Huang, Liu, Zhou, Dai and Sun https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Machine learning models to predict in-hospital mortality in septic patients with diabetes
topic Endocrinology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709414/
https://www.ncbi.nlm.nih.gov/pubmed/36465642
http://dx.doi.org/10.3389/fendo.2022.1034251