A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation
OBJECTIVE: Patients who receive most care within a single healthcare system (colloquially called a “loyalty cohort” since they typically return to the same providers) have mostly complete data within that organization’s electronic health record (EHR). Loyalty cohorts have low data missingness, which...
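Main authors: | Klann, Jeffrey G; Henderson, Darren W; Morris, Michele; Estiri, Hossein; Weber, Griffin M; Visweswaran, Shyam; Murphy, Shawn N |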
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654861/ https://www.ncbi.nlm.nih.gov/pubmed/37632234 http://dx.doi.org/10.1093/jamia/ocad166 |
_version_ | 1785136707798040576 |
---|---|
author | Klann, Jeffrey G Henderson, Darren W Morris, Michele Estiri, Hossein Weber, Griffin M Visweswaran, Shyam Murphy, Shawn N |
author_sort | Klann, Jeffrey G |
collection | PubMed |
description | OBJECTIVE: Patients who receive most care within a single healthcare system (colloquially called a “loyalty cohort” since they typically return to the same providers) have mostly complete data within that organization’s electronic health record (EHR). Loyalty cohorts have low data missingness, which can unintentionally bias research results. Using proxies of routine care and healthcare utilization metrics, we compute a per-patient score that identifies a loyalty cohort. MATERIALS AND METHODS: We implemented a computable program for the widely adopted i2b2 platform that identifies loyalty cohorts in EHRs based on a machine-learning model, which was previously validated using linked claims data. We developed a novel validation approach, which tests, using only EHR data, whether patients returned to the same healthcare system after the training period. We evaluated these tools at 3 institutions using data from 2017 to 2019. RESULTS: Loyalty cohort calculations to identify patients who returned during a 1-year follow-up yielded a mean area under the receiver operating characteristic curve of 0.77 using the original model and 0.80 after calibrating the model at individual sites. Factors such as multiple medications or visits contributed significantly at all sites. Screening tests’ contributions (eg, colonoscopy) varied across sites, likely due to coding and population differences. DISCUSSION: This open-source implementation of a “loyalty score” algorithm had good predictive power. Enriching research cohorts by utilizing these low-missingness patients is a way to obtain the data completeness necessary for accurate causal analysis. CONCLUSION: i2b2 sites can use this approach to select cohorts with mostly complete EHR data. |
format | Online Article Text |
id | pubmed-10654861 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10654861 2023-08-25 J Am Med Inform Assoc Research and Applications Oxford University Press 2023-08-25 /pmc/articles/PMC10654861/ /pubmed/37632234 http://dx.doi.org/10.1093/jamia/ocad166 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654861/ https://www.ncbi.nlm.nih.gov/pubmed/37632234 http://dx.doi.org/10.1093/jamia/ocad166 |
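The validation idea summarized in the description (score each patient from proxies of routine care, then check whether higher scores go with returning during follow-up, summarized as area under the ROC curve) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration: the feature names, the weights, the intercept, the logistic form of the score, and the toy patients are all hypothetical and are not taken from the published model or the i2b2 implementation.

```python
from math import exp

# Hypothetical weights and intercept, invented for illustration only;
# this record does not give the published model's coefficients.
WEIGHTS = {"multiple_medications": 1.2, "multiple_visits": 1.0, "screening_test": 0.4}
INTERCEPT = -1.5


def loyalty_score(flags):
    """Return a score in (0, 1) from binary routine-care flags (assumed logistic form)."""
    z = INTERCEPT + sum(WEIGHTS[name] for name, present in flags.items() if present)
    return 1.0 / (1.0 + exp(-z))


def auroc(scores, returned):
    """Probability that a returning patient outscores a non-returning one (ties count half)."""
    pos = [s for s, r in zip(scores, returned) if r]
    neg = [s for s, r in zip(scores, returned) if not r]
    if not pos or not neg:
        raise ValueError("need at least one returning and one non-returning patient")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy patients: (flags during the training period, returned during follow-up).
patients = [
    ({"multiple_medications": True, "multiple_visits": True, "screening_test": True}, True),
    ({"multiple_medications": True, "multiple_visits": False, "screening_test": False}, True),
    ({"multiple_medications": False, "multiple_visits": True, "screening_test": False}, False),
    ({"multiple_medications": False, "multiple_visits": False, "screening_test": False}, False),
    ({"multiple_medications": False, "multiple_visits": False, "screening_test": True}, True),
]

scores = [loyalty_score(flags) for flags, _ in patients]
returned = [came_back for _, came_back in patients]
result = auroc(scores, returned)
print(f"AUROC on toy data: {result:.3f}")
```

The pairwise AUROC is quadratic in cohort size, which is fine for a sketch; at real EHR scale a rank-based computation or an established statistics library would likely be preferable.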