
Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening

As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been...


Bibliographic Details
Main authors: Yang, Jenny, Soltan, Andrew A. S., Clifton, David A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9174159/
https://www.ncbi.nlm.nih.gov/pubmed/35672368
http://dx.doi.org/10.1038/s41746-022-00614-9
author Yang, Jenny
Soltan, Andrew A. S.
Clifton, David A.
collection PubMed
description As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) finetuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically-effective performances (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches.
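As an illustration of the second method named in the description (readjusting the decision threshold using site-specific data), the following is a minimal sketch only. The record does not state how the authors chose thresholds, so the selection rule here (the largest threshold that still reaches a target sensitivity on the new site's data) is an assumption, the function names are hypothetical, and the scores and labels are synthetic.

```python
from math import ceil


def threshold_for_sensitivity(scores, labels, target_sensitivity):
    """Largest threshold whose sensitivity on the site data meets the target.

    A case is predicted positive when score >= threshold.
    (Assumed selection rule; not taken from the catalogued article.)
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive case to calibrate")
    # Number of positive cases that must score at or above the threshold.
    # The small epsilon guards against float round-up in the product.
    k = max(1, ceil(target_sensitivity * len(positives) - 1e-9))
    return positives[k - 1]


def sensitivity(scores, labels, threshold):
    """Fraction of true positives that are flagged at this threshold."""
    flagged = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    total = sum(1 for y in labels if y == 1)
    return flagged / total


def npv(scores, labels, threshold):
    """Negative predictive value: TN / (TN + FN) among predicted negatives."""
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    return tn / (tn + fn) if (tn + fn) else float("nan")


# Synthetic model outputs for a hypothetical new site (not real patient data).
site_scores = [0.9, 0.8, 0.35, 0.3, 0.6, 0.2, 0.1, 0.05]
site_labels = [1, 1, 1, 1, 0, 0, 0, 0]

as_is_threshold = 0.5  # assumed default carried over from the original site
adjusted = threshold_for_sensitivity(site_scores, site_labels, 0.75)

print("as-is:   ", sensitivity(site_scores, site_labels, as_is_threshold),
      npv(site_scores, site_labels, as_is_threshold))
print("adjusted:", adjusted,
      sensitivity(site_scores, site_labels, adjusted),
      npv(site_scores, site_labels, adjusted))
```

In this toy data, lowering the threshold on the new site raises both sensitivity and NPV relative to the carried-over default, which is the kind of site-specific adjustment the description refers to. Real use would calibrate on a held-out site sample and evaluate on separate data.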
format Online
Article
Text
id pubmed-9174159
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9174159 2022-06-09 Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening Yang, Jenny Soltan, Andrew A. S. Clifton, David A. NPJ Digit Med Article As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) finetuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically-effective performances (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches. Nature Publishing Group UK 2022-06-07 /pmc/articles/PMC9174159/ /pubmed/35672368 http://dx.doi.org/10.1038/s41746-022-00614-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening
topic Article