Temporal Generalizability of Machine Learning Models for Predicting Postoperative Delirium Using Electronic Health Record Data: Model Development and Validation Study
BACKGROUND: Although machine learning models demonstrate significant potential in predicting postoperative delirium, the advantages of their implementation in real-world settings remain unclear and require a comparison with conventional models in practical applications. OBJECTIVE: The objective of t...
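Main Authors: | Matsumoto, Koutarou; Nohara, Yasunobu; Sakaguchi, Mikako; Takayama, Yohei; Fukushige, Syota; Soejima, Hidehisa; Nakashima, Naoki; Kamouchi, Masahiro |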
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10636625/ https://www.ncbi.nlm.nih.gov/pubmed/37883164 http://dx.doi.org/10.2196/50895 |
_version_ | 1785133246763237376 |
---|---|
author | Matsumoto, Koutarou; Nohara, Yasunobu; Sakaguchi, Mikako; Takayama, Yohei; Fukushige, Syota; Soejima, Hidehisa; Nakashima, Naoki; Kamouchi, Masahiro |
author_sort | Matsumoto, Koutarou |
collection | PubMed |
description | BACKGROUND: Although machine learning models demonstrate significant potential in predicting postoperative delirium, the advantages of their implementation in real-world settings remain unclear and require a comparison with conventional models in practical applications. OBJECTIVE: The objective of this study was to validate the temporal generalizability of decision tree ensemble and sparse linear regression models for predicting delirium after surgery compared with that of the traditional logistic regression model. METHODS: The health record data of patients hospitalized at an advanced emergency and critical care medical center in Kumamoto, Japan, were collected electronically. We developed a decision tree ensemble model using extreme gradient boosting (XGBoost) and a sparse linear regression model using least absolute shrinkage and selection operator (LASSO) regression. To evaluate the predictive performance of the models, we used the area under the receiver operating characteristic curve (AUROC) and the Matthews correlation coefficient (MCC) to measure discrimination and the slope and intercept of the regression between predicted and observed probabilities to measure calibration. The Brier score was evaluated as an overall performance metric. We included 11,863 consecutive patients who underwent surgery with general anesthesia between December 2017 and February 2022. The patients were divided into a derivation cohort before the COVID-19 pandemic and a validation cohort during the COVID-19 pandemic. Postoperative delirium was diagnosed according to the confusion assessment method. RESULTS: A total of 6497 patients (mean age 68.5, SD 14.4 years; women: n=2627, 40.4%) were included in the derivation cohort, and 5366 patients (mean age 67.8, SD 14.6 years; women: n=2105, 39.2%) were included in the validation cohort. Regarding discrimination, the XGBoost model (AUROC 0.87-0.90 and MCC 0.34-0.44) did not significantly outperform the LASSO model (AUROC 0.86-0.89 and MCC 0.34-0.41). The logistic regression model (AUROC 0.84-0.88, MCC 0.33-0.40, slope 1.01-1.19, intercept –0.16 to 0.06, and Brier score 0.06-0.07), with 8 predictors (age, intensive care unit, neurosurgery, emergency admission, anesthesia time, BMI, blood loss during surgery, and use of an ambulance), achieved good predictive performance. CONCLUSIONS: The XGBoost model did not significantly outperform the LASSO model in predicting postoperative delirium. Furthermore, a parsimonious logistic model with a few important predictors achieved comparable performance to machine learning models in predicting postoperative delirium. |
format | Online Article Text |
id | pubmed-10636625 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10636625 2023-11-11 JMIR Perioper Med Original Paper JMIR Publications 2023-10-26 /pmc/articles/PMC10636625/ /pubmed/37883164 http://dx.doi.org/10.2196/50895 Text en ©Koutarou Matsumoto, Yasunobu Nohara, Mikako Sakaguchi, Yohei Takayama, Syota Fukushige, Hidehisa Soejima, Naoki Nakashima, Masahiro Kamouchi. Originally published in JMIR Perioperative Medicine (http://periop.jmir.org), 26.10.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Perioperative Medicine, is properly cited. The complete bibliographic information, a link to the original publication on http://periop.jmir.org, as well as this copyright and license information must be included. |
title | Temporal Generalizability of Machine Learning Models for Predicting Postoperative Delirium Using Electronic Health Record Data: Model Development and Validation Study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10636625/ https://www.ncbi.nlm.nih.gov/pubmed/37883164 http://dx.doi.org/10.2196/50895 |
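
The abstract in this record names AUROC and MCC as discrimination metrics and the Brier score as an overall performance metric. The following is a minimal sketch of how those three quantities are commonly computed, not the authors' code: the function names, the 0.5 classification threshold, and the four-patient toy data are all assumptions made for illustration, and the calibration slope and intercept are left out.

```python
import math

def auroc(y_true, y_prob):
    """Chance that a random positive case scores above a random negative case (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def mcc(y_true, y_pred):
    """Matthews correlation coefficient computed from the 2x2 confusion matrix."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical toy data (not from the study): 1 = delirium, 0 = no delirium.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
# Assumed threshold of 0.5 to turn probabilities into class labels for MCC.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print("AUROC:", auroc(y_true, y_prob))
print("MCC:", mcc(y_true, y_pred))
print("Brier:", brier_score(y_true, y_prob))
```

AUROC and the Brier score use the predicted probabilities directly, whereas MCC depends on the chosen threshold, so a different cutoff would change the MCC but not the other two values.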