Using Transfer Learning for Improved Mortality Prediction in a Data-Scarce Hospital Setting

Bibliographic Details
Main Authors: Desautels, Thomas; Calvert, Jacob; Hoffman, Jana; Mao, Qingqing; Jay, Melissa; Fletcher, Grant; Barton, Chris; Chettipally, Uli; Kerem, Yaniv; Das, Ritankar
Format: Online Article Text
Language: English
Published: SAGE Publications 2017
Subjects: Original Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5470861/
https://www.ncbi.nlm.nih.gov/pubmed/28638239
http://dx.doi.org/10.1177/1178222617712994
_version_ 1783243836692103168
author Desautels, Thomas
Calvert, Jacob
Hoffman, Jana
Mao, Qingqing
Jay, Melissa
Fletcher, Grant
Barton, Chris
Chettipally, Uli
Kerem, Yaniv
Das, Ritankar
collection PubMed
description Algorithm–based clinical decision support (CDS) systems associate patient-derived health data with outcomes of interest, such as in-hospital mortality. However, the quality of such associations often depends on the availability of site-specific training data. Without sufficient quantities of data, the underlying statistical apparatus cannot differentiate useful patterns from noise and, as a result, may underperform. This initial training data burden limits the widespread, out-of-the-box, use of machine learning–based risk scoring systems. In this study, we implement a statistical transfer learning technique, which uses a large “source” data set to drastically reduce the amount of data needed to perform well on a “target” site for which training data are scarce. We test this transfer technique with AutoTriage, a mortality prediction algorithm, on patient charts from the Beth Israel Deaconess Medical Center (the source) and a population of 48 249 adult inpatients from University of California San Francisco Medical Center (the target institution). We find that the amount of training data required to surpass 0.80 area under the receiver operating characteristic (AUROC) on the target set decreases from more than 4000 patients to fewer than 220. This performance is superior to the Modified Early Warning Score (AUROC: 0.76) and corresponds to a decrease in clinical data collection time from approximately 6 months to less than 10 days. Our results highlight the usefulness of transfer learning in the specialization of CDS systems to new hospital sites, without requiring expensive and time-consuming data collection efforts.
format Online
Article
Text
id pubmed-5470861
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-5470861 2017-06-21 Biomed Inform Insights Original Research
SAGE Publications 2017-06-12 /pmc/articles/PMC5470861/ /pubmed/28638239 http://dx.doi.org/10.1177/1178222617712994 Text en © The Author(s) 2017 This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction, and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Using Transfer Learning for Improved Mortality Prediction in a Data-Scarce Hospital Setting
topic Original Research