Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis

OBJECTIVES: To evaluate methods of identifying patients with systemic sclerosis (SSc) using International Classification of Diseases, Tenth Revision (ICD-10) codes (M34*), electronic health record (EHR) databases and organ involvement keywords, that result in a validated cohort comprised of true cas...

Bibliographic Details
Main authors: Tukpah, Ann-Marcia C., Rose, Jonathan A., Seger, Diane L., Dellaripa, Paul F., Hunninghake, Gary M., Bates, David W.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10101630/
https://www.ncbi.nlm.nih.gov/pubmed/37053291
http://dx.doi.org/10.1371/journal.pone.0283775
author Tukpah, Ann-Marcia C.
Rose, Jonathan A.
Seger, Diane L.
Dellaripa, Paul F.
Hunninghake, Gary M.
Bates, David W.
collection PubMed
description OBJECTIVES: To evaluate methods of identifying patients with systemic sclerosis (SSc) using International Classification of Diseases, Tenth Revision (ICD-10) codes (M34*), electronic health record (EHR) databases, and organ involvement keywords that yield a validated cohort composed of true cases with high disease burden.
METHODS: We retrospectively studied patients in a healthcare system who were likely to have SSc. Using structured EHR data from January 2016 to June 2021, we identified 955 adult patients with M34* documented 2 or more times during the study period. A random subset of 100 patients was selected to validate the ICD-10 code for its positive predictive value (PPV). The dataset was then divided into training and validation sets for unstructured text processing (UTP) search algorithms, two of which were created using keywords for Raynaud’s syndrome and for esophageal involvement/symptoms.
RESULTS: Among the 955 patients, the average age was 60 years. Most patients (84%) were female; 75% of patients were White, and 5.2% were Black. Approximately 175 patients per year had the code newly documented; overall, 24% had an ICD-10 code for esophageal disease and 13.4% had one for pulmonary hypertension. The baseline PPV was 78%, which improved to 84% with UTP, identifying 788 patients likely to have SSc. After the ICD-10 code was placed, 63% of patients had a rheumatology office visit. Patients identified by the UTP search algorithm were more likely than those identified by ICD codes alone to have increased healthcare utilization (ICD-10 codes documented 4 or more times: 84.1% vs 61.7%, p < .001), organ involvement (pulmonary hypertension: 12.7% vs 6%, p = .011), and medication use (mycophenolate: 28.7% vs 11.4%, p < .001).
CONCLUSION: EHRs can be used to identify patients with SSc. Using UTP keyword searches for SSc clinical manifestations improved the PPV over ICD-10 codes alone and identified a group of patients most likely to have SSc and increased healthcare needs.
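The cohort-building steps in the abstract (M34* documented 2 or more times, a keyword search over unstructured notes, and a PPV check against chart review) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the input shapes, and the keyword lists are all hypothetical, since the abstract names only the two keyword categories (Raynaud's syndrome and esophageal involvement/symptoms) and not the actual search terms. The sketch also ignores negation ("no Raynaud's"), which a real pipeline would likely need to handle.

```python
import re
from collections import Counter

# Hypothetical keyword lists: the abstract gives only the two categories,
# not the terms the study actually searched for.
RAYNAUD_TERMS = ["raynaud"]
ESOPHAGEAL_TERMS = ["esophageal", "dysphagia", "reflux", "gerd"]


def compile_terms(terms):
    """Build a case-insensitive pattern matching words that start with any term."""
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternation + r")", re.IGNORECASE)


RAYNAUD_RE = compile_terms(RAYNAUD_TERMS)
ESOPHAGEAL_RE = compile_terms(ESOPHAGEAL_TERMS)


def code_cohort(diagnoses, min_count=2):
    """Return patient ids with an M34* code documented at least min_count times.

    diagnoses: iterable of (patient_id, icd10_code) pairs (assumed input shape).
    """
    counts = Counter(
        patient_id
        for patient_id, code in diagnoses
        if code.upper().startswith("M34")
    )
    return {patient_id for patient_id, n in counts.items() if n >= min_count}


def keyword_cohort(cohort, notes, pattern):
    """Keep cohort members whose free-text notes match the keyword pattern.

    notes: dict mapping patient_id to a list of note strings (assumed input shape).
    """
    kept = set()
    for patient_id in cohort:
        text = " ".join(notes.get(patient_id, []))
        if pattern.search(text):
            kept.add(patient_id)
    return kept


def ppv(flagged, confirmed):
    """Positive predictive value: confirmed true cases among flagged patients."""
    if not flagged:
        raise ValueError("PPV is undefined for an empty flagged set")
    return len(flagged & confirmed) / len(flagged)
```

In this sketch, `code_cohort` corresponds to the structured-data step, `keyword_cohort` is applied once per keyword pattern to mirror the two separate search algorithms, and `ppv` would be computed on a manually reviewed subset, where `confirmed` holds the patient ids judged to be true cases on chart review.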
format Online
Article
Text
id pubmed-10101630
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10101630 2023-04-14 PLoS One Research Article Public Library of Science 2023-04-13 /pmc/articles/PMC10101630/ /pubmed/37053291 http://dx.doi.org/10.1371/journal.pone.0283775 Text en © 2023 Tukpah et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis
topic Research Article