An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database
Physiological data, such as heart rate and blood pressure, are critical to clinical decision-making in the intensive care unit (ICU). Vital signs data, which are available from electronic health records, can be used to diagnose and predict important clinical outcomes. While there have been some repo...
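Main authors: | Afshar, Ali S; Li, Yijun; Chen, Zixu; Chen, Yuxuan; Lee, Jae Hun; Irani, Darius; Crank, Aidan; Singh, Digvijay; Kanter, Michael; Faraday, Nauder; Kharrazi, Hadi |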
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8327372/ https://www.ncbi.nlm.nih.gov/pubmed/34350392 http://dx.doi.org/10.1093/jamiaopen/ooab057 |
_version_ | 1783732060997812224 |
---|---|
author | Afshar, Ali S Li, Yijun Chen, Zixu Chen, Yuxuan Lee, Jae Hun Irani, Darius Crank, Aidan Singh, Digvijay Kanter, Michael Faraday, Nauder Kharrazi, Hadi |
author_facet | Afshar, Ali S Li, Yijun Chen, Zixu Chen, Yuxuan Lee, Jae Hun Irani, Darius Crank, Aidan Singh, Digvijay Kanter, Michael Faraday, Nauder Kharrazi, Hadi |
author_sort | Afshar, Ali S |
collection | PubMed |
description | Physiological data, such as heart rate and blood pressure, are critical to clinical decision-making in the intensive care unit (ICU). Vital signs data, which are available from electronic health records, can be used to diagnose and predict important clinical outcomes. While there have been some reports on the data quality of nurse-verified vital sign data, little has been reported on the data quality of the higher-frequency time-series vital signs acquired in ICUs that would enable such predictive modeling. In this study, we assessed the data quality, defined as completeness, accuracy, and timeliness, of minute-by-minute time-series vital signs data within the MIMIC-III data set, captured from 16009 patient-ICU stays and corresponding to 9410 unique adult patients. We measured the data quality of four time-series vital signs data streams in the MIMIC-III data set: heart rate (HR), respiratory rate (RR), blood oxygen saturation (SpO2), and arterial blood pressure (ABP). Approximately 30% of patient-ICU stays did not have at least 1 min of data during the time frame of the ICU stay for HR, RR, and SpO2. The percentage of patient-ICU stays that did not have at least 1 min of ABP data was ∼56%. We observed ∼80% coverage of the total duration of the ICU stay for HR, RR, and SpO2. Finally, only 12.5%, 9.9%, 7.5%, and 4.4% of ICU lengths of stay had ≥ 99% data available for HR, RR, SpO2, and ABP, respectively, that would meet the three data quality requirements we looked into in this study. Our findings on data completeness, accuracy, and timeliness have important implications for data scientists and informatics researchers who use time-series vital signs data to develop predictive models of ICU outcomes. |
format | Online Article Text |
id | pubmed-8327372 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8327372 2021-08-03 An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database Afshar, Ali S Li, Yijun Chen, Zixu Chen, Yuxuan Lee, Jae Hun Irani, Darius Crank, Aidan Singh, Digvijay Kanter, Michael Faraday, Nauder Kharrazi, Hadi JAMIA Open Brief Communications Physiological data, such as heart rate and blood pressure, are critical to clinical decision-making in the intensive care unit (ICU). Vital signs data, which are available from electronic health records, can be used to diagnose and predict important clinical outcomes. While there have been some reports on the data quality of nurse-verified vital sign data, little has been reported on the data quality of the higher-frequency time-series vital signs acquired in ICUs that would enable such predictive modeling. In this study, we assessed the data quality, defined as completeness, accuracy, and timeliness, of minute-by-minute time-series vital signs data within the MIMIC-III data set, captured from 16009 patient-ICU stays and corresponding to 9410 unique adult patients. We measured the data quality of four time-series vital signs data streams in the MIMIC-III data set: heart rate (HR), respiratory rate (RR), blood oxygen saturation (SpO2), and arterial blood pressure (ABP). Approximately 30% of patient-ICU stays did not have at least 1 min of data during the time frame of the ICU stay for HR, RR, and SpO2. The percentage of patient-ICU stays that did not have at least 1 min of ABP data was ∼56%. We observed ∼80% coverage of the total duration of the ICU stay for HR, RR, and SpO2. Finally, only 12.5%, 9.9%, 7.5%, and 4.4% of ICU lengths of stay had ≥ 99% data available for HR, RR, SpO2, and ABP, respectively, that would meet the three data quality requirements we looked into in this study. Our findings on data completeness, accuracy, and timeliness have important implications for data scientists and informatics researchers who use time-series vital signs data to develop predictive models of ICU outcomes. Oxford University Press 2021-08-02 /pmc/articles/PMC8327372/ /pubmed/34350392 http://dx.doi.org/10.1093/jamiaopen/ooab057 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Brief Communications Afshar, Ali S Li, Yijun Chen, Zixu Chen, Yuxuan Lee, Jae Hun Irani, Darius Crank, Aidan Singh, Digvijay Kanter, Michael Faraday, Nauder Kharrazi, Hadi An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
title | An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
title_full | An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
title_fullStr | An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
title_full_unstemmed | An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
title_short | An exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
title_sort | exploratory data quality analysis of time series physiologic signals using a large-scale intensive care unit database |
topic | Brief Communications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8327372/ https://www.ncbi.nlm.nih.gov/pubmed/34350392 http://dx.doi.org/10.1093/jamiaopen/ooab057 |