Combining structured and unstructured data in EMRs to create clinically-defined EMR-derived cohorts

BACKGROUND: There have been few studies describing how production EMR systems can be systematically queried to identify clinically-defined populations and limited studies utilising free-text in this process. The aim of this study is to provide a generalisable methodology for constructing clinically-...

Bibliographic Details
Main authors: Tam, Charmaine S., Gullick, Janice, Saavedra, Aldo, Vernon, Stephen T., Figtree, Gemma A., Chow, Clara K., Cretikos, Michelle, Morris, Richard W., William, Maged, Morris, Jonathan, Brieger, David
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7938556/
https://www.ncbi.nlm.nih.gov/pubmed/33685456
http://dx.doi.org/10.1186/s12911-021-01441-w
author Tam, Charmaine S.
Gullick, Janice
Saavedra, Aldo
Vernon, Stephen T.
Figtree, Gemma A.
Chow, Clara K.
Cretikos, Michelle
Morris, Richard W.
William, Maged
Morris, Jonathan
Brieger, David
collection PubMed
description BACKGROUND: There have been few studies describing how production EMR systems can be systematically queried to identify clinically-defined populations and limited studies utilising free-text in this process. The aim of this study is to provide a generalisable methodology for constructing clinically-defined EMR-derived patient cohorts using structured and unstructured data in EMRs. METHODS: Patients with possible acute coronary syndrome (ACS) were used as an exemplar. Cardiologists defined clinical criteria for patients presenting with possible ACS. These were mapped to data tables within the production EMR system, creating seven inclusion criteria comprising structured data fields (orders and investigations, procedures, scanned electrocardiogram (ECG) images, and diagnostic codes) and unstructured clinical documentation. Data were extracted from two local health districts (LHD) in Sydney, Australia. Outcome measures included examination of the relative contribution of individual inclusion criteria to the identification of eligible encounters, comparisons between inclusion criteria, and evaluation of consistency of data extracts across years and LHDs. RESULTS: Among 802,742 encounters in a 5-year dataset (1/1/13–30/12/17), the presence of an ECG image (54.8% of encounters) and symptoms and keywords in clinical documentation (41.4–64.0%) were used most often to identify presentations of possible ACS. Orders and investigations (27.3%) and procedures (1.4%) were less often present for identified presentations. Relevant ICD-10/SNOMED CT codes were present for 3.7% of identified encounters. Similar trends were seen when the two LHDs were examined separately, and across years. CONCLUSIONS: Constructing clinically-defined EMR-derived cohorts by combining structured and unstructured data during cohort identification is a necessary prerequisite for the critical validation work required for the development of real-time clinical decision support and learning health systems.
format Online
Article
Text
id pubmed-7938556
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7938556 2021-03-09 BMC Med Inform Decis Mak Research Article BioMed Central 2021-03-08 /pmc/articles/PMC7938556/ /pubmed/33685456 http://dx.doi.org/10.1186/s12911-021-01441-w Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Combining structured and unstructured data in EMRs to create clinically-defined EMR-derived cohorts
topic Research Article