dxpr: an R package for generating analysis-ready data from electronic health records—diagnoses and procedures

BACKGROUND: Enriched electronic health records (EHRs) contain crucial information related to disease progression, and this information can help with decision-making in the health care field. Data analytics in health care is deemed one of the essential processes that help accelerate the progress o...

Bibliographic Details
Main authors: Tseng, Yi-Ju; Chiu, Hsiang-Ju; Chen, Chun Ju
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176530/
https://www.ncbi.nlm.nih.gov/pubmed/34141876
http://dx.doi.org/10.7717/peerj-cs.520
author Tseng, Yi-Ju
Chiu, Hsiang-Ju
Chen, Chun Ju
collection PubMed
description BACKGROUND: Enriched electronic health records (EHRs) contain crucial information related to disease progression, and this information can help with decision-making in the health care field. Data analytics in health care is deemed one of the essential processes that help accelerate the progress of clinical research. However, processing and analyzing EHR data are common bottlenecks in health care data analytics. METHODS: The dxpr R package provides mechanisms for integration, wrangling, and visualization of clinical data, including diagnosis and procedure records. First, the dxpr package helps users transform International Classification of Diseases (ICD) codes to a uniform format. After code format transformation, the dxpr package supports four strategies for grouping clinical diagnostic data. For clinical procedure data, two grouping methods can be chosen. After EHRs are integrated, users can employ a set of flexible built-in querying functions for dividing data into case and control groups by using specified criteria and splitting the data into periods before and after an event based on the record date. Subsequently, the structure of integrated long data can be converted into wide, analysis-ready data that are suitable for statistical analysis and visualization. RESULTS: We processed comorbidity data for a cohort of newborns from the Medical Information Mart for Intensive Care-III (n = 7,833) by using the dxpr package. We first defined patent ductus arteriosus (PDA) cases as patients who had at least one PDA diagnosis (ICD, Ninth Revision, Clinical Modification [ICD-9-CM] 7470*). Controls were defined as patients who never had a PDA diagnosis. In total, 381 and 7,452 patients with and without PDA, respectively, were included in our study population. Then, we grouped the diagnoses into defined comorbidities. Finally, we observed a statistically significant difference in 8 of the 16 comorbidities between patients with and without PDA, including fluid and electrolyte disorders, valvular disease, and others. CONCLUSIONS: The dxpr package helps clinical data analysts address the common bottleneck caused by clinical data characteristics such as heterogeneity and sparseness.
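The workflow summarized in the abstract (uniform code format, grouping, case/control selection, long-to-wide conversion) can be illustrated with a minimal sketch. It is written in plain Python and does not use or reproduce the dxpr API; every function name, the toy records, and the two-entry `TOY_GROUPS` mapping are hypothetical and chosen only for illustration. In particular, `TOY_GROUPS` is not one of the comorbidity definitions used in the study.

```python
def icd9_short_to_decimal(code):
    """Convert a short-format ICD-9-CM diagnosis code to decimal format."""
    code = code.strip().upper().replace(".", "")
    # E codes keep four characters before the point; other codes keep three.
    head = 4 if code.startswith("E") else 3
    return code if len(code) <= head else code[:head] + "." + code[head:]


def split_cases_controls(records, prefix):
    """Cases have at least one code starting with `prefix`; the rest are controls."""
    patients = {pid for pid, _code, _date in records}
    cases = {pid for pid, code, _date in records
             if code.replace(".", "").startswith(prefix)}
    return cases, patients - cases


# Hypothetical prefix-to-category mapping, for illustration only.
TOY_GROUPS = {"7470": "PDA", "276": "FLUID_ELECTROLYTE"}


def group_codes(records, groups):
    """Map each record to (patient, category) pairs by code prefix (long format)."""
    grouped = []
    for pid, code, _date in records:
        short = code.replace(".", "")
        for prefix, category in groups.items():
            if short.startswith(prefix):
                grouped.append((pid, category))
    return grouped


def long_to_wide(patients, grouped, categories):
    """Build one row per patient with a True/False flag per category."""
    wide = {pid: dict.fromkeys(categories, False) for pid in patients}
    for pid, category in grouped:
        wide[pid][category] = True
    return wide


# Toy long-format records: (patient id, ICD-9-CM code, record date).
records = [
    ("p1", "7470", "2101-01-02"),
    ("p1", "2765", "2101-01-03"),
    ("p2", "V3000", "2101-02-01"),
    ("p3", "276.1", "2101-03-05"),
]

cases, controls = split_cases_controls(records, "7470")
categories = sorted(set(TOY_GROUPS.values()))
wide = long_to_wide(cases | controls, group_codes(records, TOY_GROUPS), categories)
```

The resulting `wide` dictionary holds one entry per patient with one flag per category, which is the shape that a per-comorbidity comparison between cases and controls would start from.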
format Online
Article
Text
id pubmed-8176530
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8176530 2021-06-16 PeerJ Comput Sci Bioinformatics PeerJ Inc. 2021-05-26 /pmc/articles/PMC8176530/ /pubmed/34141876 http://dx.doi.org/10.7717/peerj-cs.520 Text en ©2021 Tseng et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title dxpr: an R package for generating analysis-ready data from electronic health records—diagnoses and procedures
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176530/
https://www.ncbi.nlm.nih.gov/pubmed/34141876
http://dx.doi.org/10.7717/peerj-cs.520