Semisupervised Deep Learning Techniques for Predicting Acute Respiratory Distress Syndrome From Time-Series Clinical Data: Model Development and Validation Study

BACKGROUND: A high number of patients who are hospitalized with COVID-19 develop acute respiratory distress syndrome (ARDS). OBJECTIVE: In response to the need for clinical decision support tools to help manage the next pandemic during the early stages (ie, when limited labeled data are present), we...

Bibliographic Details
Main Authors:	Lam, Carson; Tso, Chak Foon; Green-Saxena, Abigail; Pellegrini, Emily; Iqbal, Zohora; Evans, Daniel; Hoffman, Jana; Calvert, Jacob; Mao, Qingqing; Das, Ritankar
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2021
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8447921/ https://www.ncbi.nlm.nih.gov/pubmed/34398784 http://dx.doi.org/10.2196/28028

_version_	1784569120148160512
author	Lam, Carson; Tso, Chak Foon; Green-Saxena, Abigail; Pellegrini, Emily; Iqbal, Zohora; Evans, Daniel; Hoffman, Jana; Calvert, Jacob; Mao, Qingqing; Das, Ritankar
author_sort	Lam, Carson
collection	PubMed
description	BACKGROUND: A high number of patients who are hospitalized with COVID-19 develop acute respiratory distress syndrome (ARDS). OBJECTIVE: In response to the need for clinical decision support tools to help manage the next pandemic during the early stages (ie, when limited labeled data are present), we developed machine learning algorithms that use semisupervised learning (SSL) techniques to predict ARDS development in general and COVID-19 populations based on limited labeled data. METHODS: SSL techniques were applied to 29,127 encounters with patients who were admitted to 7 US hospitals from May 1, 2019, to May 1, 2021. A recurrent neural network that used a time series of electronic health record data was applied to data that were collected when a patient’s peripheral oxygen saturation level fell below the normal range (<97%) to predict the subsequent development of ARDS during the remaining duration of patients’ hospital stay. Model performance was assessed with the area under the receiver operating characteristic curve and area under the precision recall curve of an external hold-out test set. RESULTS: For the whole data set, the median time between the first peripheral oxygen saturation measurement of <97% and subsequent respiratory failure was 21 hours. The area under the receiver operating characteristic curve for predicting subsequent ARDS development was 0.73 when the model was trained on a labeled data set of 6930 patients, 0.78 when the model was trained on the labeled data set that had been augmented with the unlabeled data set of 16,173 patients by using SSL techniques, and 0.84 when the model was trained on the entire training set of 23,103 labeled patients. CONCLUSIONS: In the context of using time-series inpatient data and a careful model training design, unlabeled data can be used to improve the performance of machine learning models when labeled data for predicting ARDS development are scarce or expensive.
format	Online Article Text
id	pubmed-8447921
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-8447921 2021-10-06 Semisupervised Deep Learning Techniques for Predicting Acute Respiratory Distress Syndrome From Time-Series Clinical Data: Model Development and Validation Study. Lam, Carson; Tso, Chak Foon; Green-Saxena, Abigail; Pellegrini, Emily; Iqbal, Zohora; Evans, Daniel; Hoffman, Jana; Calvert, Jacob; Mao, Qingqing; Das, Ritankar. JMIR Form Res. Original Paper. JMIR Publications 2021-09-14 /pmc/articles/PMC8447921/ /pubmed/34398784 http://dx.doi.org/10.2196/28028 Text en ©Carson Lam, Chak Foon Tso, Abigail Green-Saxena, Emily Pellegrini, Zohora Iqbal, Daniel Evans, Jana Hoffman, Jacob Calvert, Qingqing Mao, Ritankar Das. Originally published in JMIR Formative Research (https://formative.jmir.org), 14.09.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
title	Semisupervised Deep Learning Techniques for Predicting Acute Respiratory Distress Syndrome From Time-Series Clinical Data: Model Development and Validation Study
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8447921/ https://www.ncbi.nlm.nih.gov/pubmed/34398784 http://dx.doi.org/10.2196/28028
