
Advancing an Algorithm for the Identification of Patients with High Data-Continuity in Electronic Health Records

BACKGROUND: Identifying high data-continuity patients in an electronic health record (EHR) system may facilitate selecting cohorts with a lower degree of variable misclassification and promote study validity. We updated a previously developed algorithm for identifying patients with high EHR data-com...


Bibliographic Details
Main Authors:	Merola, David; Schneeweiss, Sebastian; Jin, Yinzhu; Lii, Joyce; Lin, Kueiyu Joshua
Format:	Online Article Text
Language:	English
Published:	Dove 2022
Subjects:	Original Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9653024/ https://www.ncbi.nlm.nih.gov/pubmed/36387928 http://dx.doi.org/10.2147/CLEP.S370031

author	Merola, David Schneeweiss, Sebastian Jin, Yinzhu Lii, Joyce Lin, Kueiyu Joshua
collection	PubMed
description	BACKGROUND: Identifying high data-continuity patients in an electronic health record (EHR) system may facilitate selecting cohorts with a lower degree of variable misclassification and promote study validity. We updated a previously developed algorithm for identifying patients with high EHR data-completeness by adding demographic and health utilization factors to improve adaptability to networks serving patients of diverse backgrounds. We also expanded the algorithm to accommodate data in the ICD-10 era. METHODS: We used Medicare claims linked with EHR data to identify individuals aged ≥65 years. EHR-continuity was defined as the proportion of encounters captured in EHR data relative to claims. We compared a model that added demographic factors, and their interaction terms with other predictors, against the original algorithm and assessed performance by area under the ROC curve (AUC) and net reclassification index (NRI). RESULTS: The study cohort consisted of 264,099 subjects. The updated prediction model had an AUC of 0.93 in the validation set. Compared to the previous model, the new model had an NRI of 37.4% (p<0.001) for EHR-continuity classification. Interaction terms between demographic variables and other predictors did not improve performance. Patients within the top 20% of predicted EHR-continuity had four times less misclassification of key variables compared to the remaining population. CONCLUSION: Adding demographic and healthcare utilization variables significantly improved model performance. Patients with high predicted EHR-continuity had less misclassification of study variables compared to the remaining population in both the ICD-9 and ICD-10 eras.
format	Online Article Text
id	pubmed-9653024
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Dove
record_format	MEDLINE/PubMed
spelling	pubmed-9653024 2022-11-15 Advancing an Algorithm for the Identification of Patients with High Data-Continuity in Electronic Health Records Merola, David Schneeweiss, Sebastian Jin, Yinzhu Lii, Joyce Lin, Kueiyu Joshua Clin Epidemiol Original Research Dove 2022-11-08 /pmc/articles/PMC9653024/ /pubmed/36387928 http://dx.doi.org/10.2147/CLEP.S370031 Text en © 2022 Merola et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (https://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php).
title	Advancing an Algorithm for the Identification of Patients with High Data-Continuity in Electronic Health Records
topic	Original Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9653024/ https://www.ncbi.nlm.nih.gov/pubmed/36387928 http://dx.doi.org/10.2147/CLEP.S370031
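
The two quantities named in the abstract, EHR-continuity (the proportion of encounters captured in EHR data relative to claims) and the net reclassification index, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the plain ratio used for continuity, the handling of patients with no claims encounters, and the two-category form of the NRI are all assumptions made here for illustration.

```python
from typing import Optional, Sequence


def ehr_continuity(ehr_encounters: int, claims_encounters: int) -> Optional[float]:
    """Proportion of claims-recorded encounters that also appear in the EHR.

    Assumption: a simple ratio; returns None when there are no claims
    encounters, because the proportion is undefined in that case.
    """
    if claims_encounters <= 0:
        return None
    return ehr_encounters / claims_encounters


def net_reclassification_index(
    truth: Sequence[bool],
    old_pred: Sequence[bool],
    new_pred: Sequence[bool],
) -> float:
    """Two-category NRI of a new classifier relative to an old one.

    truth[i] is True for patients who truly have high continuity (events).
    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)],
    where "up" means reclassified from low to high by the new model.
    """
    up_event = down_event = n_event = 0
    up_non = down_non = n_non = 0
    for t, o, n in zip(truth, old_pred, new_pred):
        moved_up = bool(n) and not bool(o)
        moved_down = bool(o) and not bool(n)
        if t:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_non += 1
            up_non += moved_up
            down_non += moved_down
    if n_event == 0 or n_non == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_event - down_event) / n_event + (down_non - up_non) / n_non


if __name__ == "__main__":
    # Hypothetical patient: 3 of 4 claims encounters also appear in the EHR.
    print(ehr_continuity(3, 4))
    # Hypothetical labels and predictions for four patients.
    truth = [True, True, False, False]
    old = [False, True, True, False]
    new = [True, True, False, False]
    print(net_reclassification_index(truth, old, new))
```

A positive NRI indicates that the new model moves more patients into the correct category than the old one does; the values printed here come from made-up inputs and have no relation to the 37.4% reported in the record.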