
A New Insight Into Missing Data in Intensive Care Unit Patient Profiles: Observational Study

BACKGROUND: The data missing from patient profiles in intensive care units (ICUs) are substantial and unavoidable. However, this incompleteness is not always random or because of imperfections in the data collection process. OBJECTIVE: This study aimed to investigate the potential hidden information...


Bibliographic Details
Main Authors: Sharafoddini, Anis, Dubin, Joel A, Maslove, David M, Lee, Joon
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6329436/
https://www.ncbi.nlm.nih.gov/pubmed/30622091
http://dx.doi.org/10.2196/11605
author Sharafoddini, Anis
Dubin, Joel A
Maslove, David M
Lee, Joon
collection PubMed
description BACKGROUND: The data missing from patient profiles in intensive care units (ICUs) are substantial and unavoidable. However, this incompleteness is not always random, nor is it always caused by imperfections in the data collection process.
OBJECTIVE: This study aimed to investigate the potential hidden information in data missing from electronic health records (EHRs) in an ICU and to examine whether the presence or missingness of a variable can itself convey information about the patient's health status.
METHODS: Daily retrieval of laboratory test (LT) measurements from the Medical Information Mart for Intensive Care III database was set as the reference for defining complete patient profiles. Missingness indicators were introduced to represent the presence or absence of each LT in a patient profile. Thereafter, various feature selection methods (filter and embedded methods) were used to examine the predictive power of the missingness indicators. Finally, a set of well-known prediction models (logistic regression [LR], decision tree, and random forest) was used to evaluate whether the absence of a variable's recording can by itself provide predictive power. We also examined the utility of missingness indicators in improving predictive performance when used together with observed laboratory measurements as model input. The outcomes of interest were in-hospital mortality and mortality at 30 days after ICU discharge.
RESULTS: Regardless of mortality type or ICU day, more than 40% of the predictors selected by the feature selection methods were missingness indicators. Notably, using missingness indicators as the only predictors achieved reasonable mortality prediction on all days and for all mortality types (for instance, in 30-day mortality prediction with LR, we achieved an area under the receiver operating characteristic curve [AUROC] of 0.6836±0.012). Including indicators with observed measurements in the prediction models also improved the AUROC; the maximum improvement was 0.0426. Indicators also improved the AUROC for the Simplified Acute Physiology Score II model, a well-known ICU severity of illness score, confirming the additive information of the indicators (AUROC of 0.8045±0.0109 for 30-day mortality prediction with LR).
CONCLUSIONS: Our study demonstrated that the presence or absence of LT measurements is informative and can be considered a potential predictor of in-hospital and 30-day mortality. The comparative analysis of prediction models also showed statistically significant prediction improvement when indicators were included. Moreover, missing data might reflect the opinions of examining clinicians. Therefore, the absence of measurements can be informative in ICUs and has predictive power beyond the measured data themselves. This initial case study shows promise for more in-depth analysis of missing data and its informativeness in ICUs. Future studies are needed to generalize these results.
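The missingness-indicator idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the lab-test names, the toy profiles, the outcome labels, and the crude "number of tests present" score are all assumptions made up for illustration, and no model is fitted. It only shows how a profile is turned into 0/1 indicators and how an AUROC can be computed from any score.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical lab-test names; the record does not list which tests were used.
LAB_TESTS = ["lactate", "creatinine", "bilirubin"]


def missingness_indicators(profile: Dict[str, Optional[float]],
                           tests: Sequence[str] = LAB_TESTS) -> List[int]:
    """Return 1 for each test with a recorded value and 0 for each absent one."""
    return [0 if profile.get(t) is None else 1 for t in tests]


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy one-day profiles with made-up values; None means "not measured".
profiles = [
    {"lactate": 4.1, "creatinine": 2.3, "bilirubin": 1.9},
    {"lactate": 3.2, "creatinine": 1.1, "bilirubin": None},
    {"lactate": None, "creatinine": 0.9, "bilirubin": None},
    {"lactate": None, "creatinine": None, "bilirubin": None},
]
died = [1, 0, 1, 0]  # made-up outcome labels, not study data

indicators = [missingness_indicators(p) for p in profiles]
# Crude stand-in for a model: score each patient by how many tests were present.
scores = [sum(row) for row in indicators]

print(indicators)
print(auroc(scores, died))
```

In a real analysis the indicator columns would be passed, alone or alongside the observed measurements, to a fitted classifier such as logistic regression, and the AUROC would be estimated on held-out data rather than on four hand-written rows.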
format Online
Article
Text
id pubmed-6329436
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6329436 2019-02-11 JMIR Med Inform Original Paper JMIR Publications 2019-01-08 /pmc/articles/PMC6329436/ /pubmed/30622091 http://dx.doi.org/10.2196/11605 Text en
©Anis Sharafoddini, Joel A Dubin, David M Maslove, Joon Lee. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 08.01.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title A New Insight Into Missing Data in Intensive Care Unit Patient Profiles: Observational Study
topic Original Paper