An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database

Bibliographic Details
Main authors: Afshar, Ali S; Li, Yijun; Chen, Zixu; Chen, Yuxuan; Lee, Jae Hun; Irani, Darius; Crank, Aidan; Singh, Digvijay; Kanter, Michael; Faraday, Nauder; Kharrazi, Hadi
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8327372/
https://www.ncbi.nlm.nih.gov/pubmed/34350392
http://dx.doi.org/10.1093/jamiaopen/ooab057
author Afshar, Ali S
Li, Yijun
Chen, Zixu
Chen, Yuxuan
Lee, Jae Hun
Irani, Darius
Crank, Aidan
Singh, Digvijay
Kanter, Michael
Faraday, Nauder
Kharrazi, Hadi
collection PubMed
description Physiological data, such as heart rate and blood pressure, are critical to clinical decision-making in the intensive care unit (ICU). Vital signs data, which are available from electronic health records, can be used to diagnose and predict important clinical outcomes. While there have been some reports on the data quality of nurse-verified vital sign data, little has been reported on the data quality of the higher-frequency time-series vital signs acquired in ICUs that would enable such predictive modeling. In this study, we assessed data quality, defined as completeness, accuracy, and timeliness, of minute-by-minute time-series vital signs data within the MIMIC-III data set, captured from 16009 patient-ICU stays and corresponding to 9410 unique adult patients. We measured the data quality of four time-series vital signs data streams in the MIMIC-III data set: heart rate (HR), respiratory rate (RR), blood oxygen saturation (SpO2), and arterial blood pressure (ABP). Approximately 30% of patient-ICU stays did not have at least 1 min of data during the time frame of the ICU stay for HR, RR, and SpO2. The percentage of patient-ICU stays that did not have at least 1 min of ABP data was ∼56%. We observed ∼80% coverage of the total duration of the ICU stay for HR, RR, and SpO2. Finally, only 12.5%, 9.9%, 7.5%, and 4.4% of ICU lengths of stay had ≥99% data available for HR, RR, SpO2, and ABP, respectively, that would meet the three data quality requirements we looked into in this study. Our findings on data completeness, accuracy, and timeliness have important implications for data scientists and informatics researchers who use time-series vital signs data to develop predictive models of ICU outcomes.
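The completeness figures quoted in the description (stays with no data at all, coverage of the stay duration, stays with at least 99% of minutes available) can be illustrated with a minimal sketch. It is not the authors' code: the function names `completeness` and `summarize`, the input layout (a start time, an end time, and a list of sample timestamps per stay), and the exact definition of coverage as "minutes with at least one sample divided by whole minutes in the stay" are all assumptions made for illustration. Accuracy and timeliness are not covered.

```python
from datetime import datetime, timedelta


def completeness(stay_start, stay_end, sample_times):
    """Fraction of whole minutes in [stay_start, stay_end) that hold >= 1 sample.

    Hypothetical helper: the record does not give the study's exact formula.
    """
    total_minutes = int((stay_end - stay_start).total_seconds() // 60)
    if total_minutes <= 0:
        return 0.0
    covered = set()
    for t in sample_times:
        if stay_start <= t < stay_end:
            minute_index = int((t - stay_start).total_seconds() // 60)
            # Ignore samples that fall in a trailing partial minute.
            if minute_index < total_minutes:
                covered.add(minute_index)
    return len(covered) / total_minutes


def summarize(stays, threshold=0.99):
    """Aggregate per-stay completeness for one signal (e.g. HR).

    `stays` is a non-empty list of (stay_start, stay_end, sample_times) tuples.
    """
    fractions = [completeness(start, end, times) for start, end, times in stays]
    n = len(fractions)
    return {
        # Share of stays without a single minute of data.
        "pct_no_data": 100.0 * sum(f == 0.0 for f in fractions) / n,
        # Share of stays meeting the availability threshold.
        "pct_at_least_threshold": 100.0 * sum(f >= threshold for f in fractions) / n,
        # Mean per-stay coverage of the stay duration.
        "mean_coverage_pct": 100.0 * sum(fractions) / n,
    }


if __name__ == "__main__":
    start = datetime(2021, 1, 1)
    end = start + timedelta(minutes=10)
    partial = [start + timedelta(minutes=m) for m in range(8)]
    full = [start + timedelta(minutes=m) for m in range(10)]
    print(summarize([(start, end, partial), (start, end, []), (start, end, full)]))
```

In this toy input, one of three stays has no data and one is fully covered. Real MIMIC-III waveform records would first need to be aligned to the ICU stay window, which this sketch does not attempt.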
format Online
Article
Text
id pubmed-8327372
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8327372 2021-08-03 An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database. JAMIA Open. Brief Communications. Oxford University Press 2021-08-02 /pmc/articles/PMC8327372/ /pubmed/34350392 http://dx.doi.org/10.1093/jamiaopen/ooab057 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database
topic Brief Communications