Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening
As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been...
Main authors: | Yang, Jenny; Soltan, Andrew A. S.; Clifton, David A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9174159/ https://www.ncbi.nlm.nih.gov/pubmed/35672368 http://dx.doi.org/10.1038/s41746-022-00614-9 |
_version_ | 1784722179120693248 |
---|---|
author | Yang, Jenny; Soltan, Andrew A. S.; Clifton, David A. |
author_facet | Yang, Jenny; Soltan, Andrew A. S.; Clifton, David A. |
author_sort | Yang, Jenny |
collection | PubMed |
description | As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) fine-tuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches. |
format | Online Article Text |
id | pubmed-9174159 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9174159 2022-06-09 Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening Yang, Jenny; Soltan, Andrew A. S.; Clifton, David A. NPJ Digit Med Article As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) fine-tuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches. Nature Publishing Group UK 2022-06-07 /pmc/articles/PMC9174159/ /pubmed/35672368 http://dx.doi.org/10.1038/s41746-022-00614-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Yang, Jenny Soltan, Andrew A. S. Clifton, David A. Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening |
title | Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9174159/ https://www.ncbi.nlm.nih.gov/pubmed/35672368 http://dx.doi.org/10.1038/s41746-022-00614-9 |
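
The second method named in the description field (readjusting the decision threshold on a ready-made model's output using site-specific data) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy scores and labels, and the 0.75 target sensitivity are all assumptions chosen for the example, and the record does not state how the threshold was actually selected.

```python
import math


def pick_threshold(scores, labels, target_sensitivity):
    """Return the highest threshold whose sensitivity on the site data meets the target.

    Hypothetical helper: a case is predicted positive when score >= threshold.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive case to pick a threshold")
    # Number of positive cases that must score at or above the threshold.
    needed = max(1, math.ceil(target_sensitivity * len(positives)))
    return positives[needed - 1]


def sensitivity(scores, labels, threshold):
    """Fraction of true positive cases that the threshold flags as positive."""
    flagged = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    return flagged / sum(labels)


def npv(scores, labels, threshold):
    """Negative predictive value: true negatives / all predicted negatives."""
    true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    predicted_neg = sum(1 for s in scores if s < threshold)
    return true_neg / predicted_neg if predicted_neg else float("nan")


# Toy site-specific data (assumed): model output scores and true labels.
site_scores = [0.9, 0.8, 0.35, 0.3, 0.6, 0.2, 0.1, 0.05]
site_labels = [1, 1, 1, 1, 0, 0, 0, 0]

default_threshold = 0.5  # assumed threshold carried over from the source site
site_threshold = pick_threshold(site_scores, site_labels, target_sensitivity=0.75)

print("sensitivity as-is:     ", sensitivity(site_scores, site_labels, default_threshold))
print("sensitivity readjusted:", sensitivity(site_scores, site_labels, site_threshold))
print("NPV readjusted:        ", npv(site_scores, site_labels, site_threshold))
```

On this toy data the carried-over threshold misses half of the positive cases, while the site-specific threshold recovers the target sensitivity. The model itself is left unchanged, which is what distinguishes this approach from fine-tuning.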