Assessing the Generalizability of a Clinical Machine Learning Model Across Multiple Emergency Departments

OBJECTIVE: To assess the generalizability of a clinical machine learning algorithm across multiple emergency departments (EDs). PATIENTS AND METHODS: We obtained data on all ED visits at our health care system’s largest ED from May 5, 2018, to December 31, 2019. We also obtained data from 3 satellit...

Bibliographic Details
Main Authors: Ryu, Alexander J., Romero-Brufau, Santiago, Qian, Ray, Heaton, Heather A., Nestler, David M., Ayanian, Shant, Kingsley, Thomas C.
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9062323/
https://www.ncbi.nlm.nih.gov/pubmed/35517246
http://dx.doi.org/10.1016/j.mayocpiqo.2022.03.003
author Ryu, Alexander J.
Romero-Brufau, Santiago
Qian, Ray
Heaton, Heather A.
Nestler, David M.
Ayanian, Shant
Kingsley, Thomas C.
collection PubMed
description OBJECTIVE: To assess the generalizability of a clinical machine learning algorithm across multiple emergency departments (EDs). PATIENTS AND METHODS: We obtained data on all ED visits at our health care system’s largest ED from May 5, 2018, to December 31, 2019. We also obtained data from 3 satellite EDs and 1 distant-hub ED from May 1, 2018, to December 31, 2018. A gradient-boosted machine model was trained on pooled data from the included EDs. To control for the effect of differing training set sizes, the data were randomly downsampled to match the number of visits at our smallest ED. A second model was trained on this downsampled, pooled data. The models’ performance was compared using the area under the receiver operating characteristic curve (AUC). Finally, site-specific models were trained and tested across all the sites, and the importance of features was examined to understand the reasons for differing generalizability. RESULTS: The training data sets contained 1918 to 64,161 ED visits. The AUC for the pooled model ranged from 0.84 to 0.94 across the sites; the performance decreased slightly when the data were downsampled to match the size of our smallest ED site. When site-specific models were trained and tested across all the sites, the AUCs ranged more widely, from 0.71 to 0.93. Within a single ED site, the performance of the 5 site-specific models was most variable for our largest and smallest EDs. Finally, when the importance of features was examined, several features were common to all site-specific models; however, the weight of these features differed. CONCLUSION: A machine learning model for predicting hospital admission from the ED will generalize fairly well within the health care system but will still have significant differences in AUC performance across sites because of site-specific factors.
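The evaluation design described in the abstract (downsampling every site to the size of the smallest one, then scoring each site-specific model on every site's test data) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not describe their implementation, and the function names `downsample_to_smallest`, `auc`, and `cross_site_auc`, as well as the data layout, are hypothetical. Model training is left out; each "model" is assumed to be any function that maps one visit's features to a risk score.

```python
import random


def downsample_to_smallest(sites, seed=0):
    """Randomly reduce every site's visit list to the size of the smallest site.

    `sites` maps a site name to a list of visit records (hypothetical layout).
    """
    rng = random.Random(seed)
    n_min = min(len(visits) for visits in sites.values())
    return {name: rng.sample(visits, n_min) for name, visits in sites.items()}


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half (quadratic time, sketch only)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_site_auc(models, test_sets):
    """Score every site-specific model on every site's test set.

    `models` maps a training site to a scoring function (assumed interface);
    `test_sets` maps a test site to a (features, labels) pair.
    Returns a dict keyed by (train_site, test_site).
    """
    return {
        (train_site, test_site): auc(labels, [score(x) for x in features])
        for train_site, score in models.items()
        for test_site, (features, labels) in test_sets.items()
    }
```

Reading the resulting dictionary row by row (fixed training site) or column by column (fixed test site) gives the kind of cross-site comparison the abstract reports, where AUCs varied more widely for site-specific models than for the pooled model.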
format Online
Article
Text
id pubmed-9062323
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9062323 2022-05-04 Mayo Clin Proc Innov Qual Outcomes Original Article Elsevier 2022-04-26 /pmc/articles/PMC9062323/ /pubmed/35517246 http://dx.doi.org/10.1016/j.mayocpiqo.2022.03.003 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Assessing the Generalizability of a Clinical Machine Learning Model Across Multiple Emergency Departments
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9062323/
https://www.ncbi.nlm.nih.gov/pubmed/35517246
http://dx.doi.org/10.1016/j.mayocpiqo.2022.03.003