A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation

OBJECTIVE: Patients who receive most care within a single healthcare system (colloquially called a “loyalty cohort” since they typically return to the same providers) have mostly complete data within that organization’s electronic health record (EHR). Loyalty cohorts have low data missingness, which...

Bibliographic Details
Main Authors: Klann, Jeffrey G; Henderson, Darren W; Morris, Michele; Estiri, Hossein; Weber, Griffin M; Visweswaran, Shyam; Murphy, Shawn N
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Research and Applications
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654861/
https://www.ncbi.nlm.nih.gov/pubmed/37632234
http://dx.doi.org/10.1093/jamia/ocad166
author Klann, Jeffrey G
Henderson, Darren W
Morris, Michele
Estiri, Hossein
Weber, Griffin M
Visweswaran, Shyam
Murphy, Shawn N
collection PubMed
description OBJECTIVE: Patients who receive most care within a single healthcare system (colloquially called a “loyalty cohort” since they typically return to the same providers) have mostly complete data within that organization’s electronic health record (EHR). Loyalty cohorts have low data missingness, which can unintentionally bias research results. Using proxies of routine care and healthcare utilization metrics, we compute a per-patient score that identifies a loyalty cohort. MATERIALS AND METHODS: We implemented a computable program for the widely adopted i2b2 platform that identifies loyalty cohorts in EHRs based on a machine-learning model, which was previously validated using linked claims data. We developed a novel validation approach, which tests, using only EHR data, whether patients returned to the same healthcare system after the training period. We evaluated these tools at 3 institutions using data from 2017 to 2019. RESULTS: Loyalty cohort calculations to identify patients who returned during a 1-year follow-up yielded a mean area under the receiver operating characteristic curve of 0.77 using the original model and 0.80 after calibrating the model at individual sites. Factors such as multiple medications or visits contributed significantly at all sites. Screening tests’ contributions (eg, colonoscopy) varied across sites, likely due to coding and population differences. DISCUSSION: This open-source implementation of a “loyalty score” algorithm had good predictive power. Enriching research cohorts by utilizing these low-missingness patients is a way to obtain the data completeness necessary for accurate causal analysis. CONCLUSION: i2b2 sites can use this approach to select cohorts with mostly complete EHR data.
format Online
Article
Text
id pubmed-10654861
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10654861 2023-08-25 J Am Med Inform Assoc Research and Applications Oxford University Press Text en
© The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation
topic Research and Applications