External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research

OBJECTIVE: Electronic health records (EHR) data-discontinuity, i.e. receiving care outside of a particular EHR system, may cause misclassification of study variables. We aimed to validate an algorithm to identify patients with high EHR data-continuity to reduce such bias. MATERIALS AND METHODS: We a...

Bibliographic Details
Main authors: Lin, Kueiyu Joshua, Rosenthal, Gary E, Murphy, Shawn N, Mandl, Kenneth D, Jin, Yinzhu, Glynn, Robert J, Schneeweiss, Sebastian
Format: Online Article Text
Language: English
Published: Dove 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7007793/
https://www.ncbi.nlm.nih.gov/pubmed/32099479
http://dx.doi.org/10.2147/CLEP.S232540
author Lin, Kueiyu Joshua
Rosenthal, Gary E
Murphy, Shawn N
Mandl, Kenneth D
Jin, Yinzhu
Glynn, Robert J
Schneeweiss, Sebastian
collection PubMed
description OBJECTIVE: Electronic health records (EHR) data-discontinuity, i.e. receiving care outside of a particular EHR system, may cause misclassification of study variables. We aimed to validate an algorithm to identify patients with high EHR data-continuity to reduce such bias. MATERIALS AND METHODS: We analyzed data from two EHR systems linked with Medicare claims data from 2007 through 2014, one in Massachusetts (MA, n=80,588) and the other in North Carolina (NC, n=33,207). We quantified EHR data-continuity by Mean Proportion of Encounters Captured (MPEC) by the EHR system when compared to complete recording in claims data. The prediction model for MPEC was developed in MA and validated in NC. Stratified by predicted EHR data-continuity, we quantified misclassification of 40 key variables by Mean Standardized Differences (MSD) between the proportions of these variables based on EHR alone vs the linked claims-EHR data. RESULTS: The mean MPEC was 27% in the MA and 26% in the NC system. The predicted and observed EHR data-continuity was highly correlated (Spearman correlation=0.78 and 0.73, respectively). The misclassification (MSD) of 40 variables in patients of the predicted EHR data-continuity cohort was significantly smaller (44%, 95% CI: 40–48%) than that in the remaining population. DISCUSSION: The comorbidity profiles were similar in patients with high vs low EHR data-continuity. Therefore, restricting an analysis to patients with high EHR data-continuity may reduce information bias while preserving the representativeness of the study cohort. CONCLUSION: We have successfully validated an algorithm that can identify a high EHR data-continuity cohort representative of the source population.
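The two metrics named in the description can be illustrated with a minimal Python sketch. The abstract does not give formulas, so everything here is an assumption for illustration: MPEC is taken as the per-patient average, across encounter types, of the share of claims-recorded encounters that also appear in the EHR, and the standardized difference uses the common formula for two proportions. The function names and the dictionary layout are hypothetical and are not taken from the article.

```python
from math import sqrt
from statistics import mean


def mpec(ehr_counts, claims_counts):
    """Mean proportion of encounters captured for one patient (assumed definition).

    Both arguments map an encounter type (e.g. "inpatient") to a count.
    Claims counts are treated as the complete reference.
    """
    proportions = []
    for kind, n_claims in claims_counts.items():
        if n_claims == 0:
            continue  # no reference encounters of this type, so nothing to capture
        captured = min(ehr_counts.get(kind, 0), n_claims)
        proportions.append(captured / n_claims)
    return mean(proportions) if proportions else None


def standardized_difference(p_ehr, p_linked):
    """Absolute standardized difference between two proportions (assumed formula)."""
    pooled = (p_ehr * (1 - p_ehr) + p_linked * (1 - p_linked)) / 2
    if pooled == 0:
        # Both proportions are 0 or 1: identical means no difference.
        return 0.0 if p_ehr == p_linked else float("inf")
    return abs(p_ehr - p_linked) / sqrt(pooled)


def msd(proportion_pairs):
    """Mean standardized difference over (EHR-only, linked claims-EHR) pairs."""
    return mean([standardized_difference(a, b) for a, b in proportion_pairs])


# Hypothetical patient: 1 of 2 inpatient and 2 of 8 outpatient encounters captured.
patient_mpec = mpec({"inpatient": 1, "outpatient": 2},
                    {"inpatient": 2, "outpatient": 8})

# Hypothetical prevalences of two variables: EHR alone vs linked data.
cohort_msd = msd([(0.5, 0.5), (0.2, 0.3)])
```

Under these assumptions, a lower MSD in the high-continuity stratum corresponds to less misclassification when variables are measured from the EHR alone.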
format Online
Article
Text
id pubmed-7007793
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Dove
record_format MEDLINE/PubMed
spelling pubmed-7007793 2020-02-25 External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research Lin, Kueiyu Joshua Rosenthal, Gary E Murphy, Shawn N Mandl, Kenneth D Jin, Yinzhu Glynn, Robert J Schneeweiss, Sebastian Clin Epidemiol Original Research Dove 2020-02-04 /pmc/articles/PMC7007793/ /pubmed/32099479 http://dx.doi.org/10.2147/CLEP.S232540 Text en © 2020 Lin et al. http://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (http://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php).
title External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research
topic Original Research