Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis
OBJECTIVES: To evaluate methods of identifying patients with systemic sclerosis (SSc) using International Classification of Diseases, Tenth Revision (ICD-10) codes (M34*), electronic health record (EHR) databases and organ involvement keywords, that result in a validated cohort comprised of true cas...
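Main Authors: | Tukpah, Ann-Marcia C.; Rose, Jonathan A.; Seger, Diane L.; Dellaripa, Paul F.; Hunninghake, Gary M.; Bates, David W. |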
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10101630/ https://www.ncbi.nlm.nih.gov/pubmed/37053291 http://dx.doi.org/10.1371/journal.pone.0283775 |
_version_ | 1785025549508280320 |
---|---|
author | Tukpah, Ann-Marcia C. Rose, Jonathan A. Seger, Diane L. Dellaripa, Paul F. Hunninghake, Gary M. Bates, David W. |
author_facet | Tukpah, Ann-Marcia C. Rose, Jonathan A. Seger, Diane L. Dellaripa, Paul F. Hunninghake, Gary M. Bates, David W. |
author_sort | Tukpah, Ann-Marcia C. |
collection | PubMed |
description | OBJECTIVES: To evaluate methods of identifying patients with systemic sclerosis (SSc) using International Classification of Diseases, Tenth Revision (ICD-10) codes (M34*), electronic health record (EHR) databases and organ involvement keywords, that result in a validated cohort comprised of true cases with high disease burden. METHODS: We retrospectively studied patients in a healthcare system likely to have SSc. Using structured EHR data from January 2016 to June 2021, we identified 955 adult patients with M34* documented 2 or more times during the study period. A random subset of 100 patients was selected to validate the ICD-10 code for its positive predictive value (PPV). The dataset was then divided into training and validation sets for unstructured text processing (UTP) search algorithms, two of which were created using keywords for Raynaud’s syndrome and esophageal involvement/symptoms. RESULTS: Among 955 patients, the average age was 60. Most patients (84%) were female; 75% of patients were White, and 5.2% were Black. There were approximately 175 patients per year with the code newly documented; overall, 24% had an ICD-10 code for esophageal disease, and 13.4% for pulmonary hypertension. The baseline PPV was 78%, which improved to 84% with UTP, identifying 788 patients likely to have SSc. After the ICD-10 code was placed, 63% of patients had a rheumatology office visit. Patients identified by the UTP search algorithm were more likely to have increased healthcare utilization (ICD-10 codes 4 or more times 84.1% vs 61.7%, p < .001), organ involvement (pulmonary hypertension 12.7% vs 6%, p = .011) and medication use (mycophenolate use 28.7% vs 11.4%, p < .001) than those identified by the ICD codes alone. CONCLUSION: EHRs can be used to identify patients with SSc. Using unstructured text processing keyword searches for SSc clinical manifestations improved the PPV of ICD-10 codes alone and identified a group of patients most likely to have SSc and increased healthcare needs. |
format | Online Article Text |
id | pubmed-10101630 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10101630 2023-04-14 Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis Tukpah, Ann-Marcia C. Rose, Jonathan A. Seger, Diane L. Dellaripa, Paul F. Hunninghake, Gary M. Bates, David W. PLoS One Research Article OBJECTIVES: To evaluate methods of identifying patients with systemic sclerosis (SSc) using International Classification of Diseases, Tenth Revision (ICD-10) codes (M34*), electronic health record (EHR) databases and organ involvement keywords, that result in a validated cohort comprised of true cases with high disease burden. METHODS: We retrospectively studied patients in a healthcare system likely to have SSc. Using structured EHR data from January 2016 to June 2021, we identified 955 adult patients with M34* documented 2 or more times during the study period. A random subset of 100 patients was selected to validate the ICD-10 code for its positive predictive value (PPV). The dataset was then divided into training and validation sets for unstructured text processing (UTP) search algorithms, two of which were created using keywords for Raynaud’s syndrome and esophageal involvement/symptoms. RESULTS: Among 955 patients, the average age was 60. Most patients (84%) were female; 75% of patients were White, and 5.2% were Black. There were approximately 175 patients per year with the code newly documented; overall, 24% had an ICD-10 code for esophageal disease, and 13.4% for pulmonary hypertension. The baseline PPV was 78%, which improved to 84% with UTP, identifying 788 patients likely to have SSc. After the ICD-10 code was placed, 63% of patients had a rheumatology office visit. Patients identified by the UTP search algorithm were more likely to have increased healthcare utilization (ICD-10 codes 4 or more times 84.1% vs 61.7%, p < .001), organ involvement (pulmonary hypertension 12.7% vs 6%, p = .011) and medication use (mycophenolate use 28.7% vs 11.4%, p < .001) than those identified by the ICD codes alone. CONCLUSION: EHRs can be used to identify patients with SSc. Using unstructured text processing keyword searches for SSc clinical manifestations improved the PPV of ICD-10 codes alone and identified a group of patients most likely to have SSc and increased healthcare needs. Public Library of Science 2023-04-13 /pmc/articles/PMC10101630/ /pubmed/37053291 http://dx.doi.org/10.1371/journal.pone.0283775 Text en © 2023 Tukpah et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Tukpah, Ann-Marcia C. Rose, Jonathan A. Seger, Diane L. Dellaripa, Paul F. Hunninghake, Gary M. Bates, David W. Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
title | Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
title_full | Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
title_fullStr | Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
title_full_unstemmed | Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
title_short | Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
title_sort | development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10101630/ https://www.ncbi.nlm.nih.gov/pubmed/37053291 http://dx.doi.org/10.1371/journal.pone.0283775 |
work_keys_str_mv | AT tukpahannmarciac developmentandvalidationofalgorithmstobuildanelectronichealthrecordbasedcohortofpatientswithsystemicsclerosis AT rosejonathana developmentandvalidationofalgorithmstobuildanelectronichealthrecordbasedcohortofpatientswithsystemicsclerosis AT segerdianel developmentandvalidationofalgorithmstobuildanelectronichealthrecordbasedcohortofpatientswithsystemicsclerosis AT dellaripapaulf developmentandvalidationofalgorithmstobuildanelectronichealthrecordbasedcohortofpatientswithsystemicsclerosis AT hunninghakegarym developmentandvalidationofalgorithmstobuildanelectronichealthrecordbasedcohortofpatientswithsystemicsclerosis AT batesdavidw developmentandvalidationofalgorithmstobuildanelectronichealthrecordbasedcohortofpatientswithsystemicsclerosis |
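
The cohort-building steps summarized in the description field (an M34* code documented 2 or more times, a keyword search over unstructured text, and a PPV check against chart review) can be sketched roughly as follows. This is a minimal illustration on invented toy data, not the authors' implementation: the keyword patterns, the record layout, the field name `chart_confirmed`, and the choice to accept a patient when either keyword group matches are all assumptions, since this record does not give the actual keyword lists or matching rules.

```python
import re

# Hypothetical keyword patterns; the study's real keyword lists are not in this record.
RAYNAUD = re.compile(r"\braynaud", re.IGNORECASE)
ESOPHAGEAL = re.compile(r"\b(esophag|oesophag|dysphagia|reflux|gerd)", re.IGNORECASE)


def has_repeated_m34(codes, min_count=2):
    """Return True if ICD-10 codes starting with M34 occur at least min_count times."""
    return sum(1 for code in codes if code.upper().startswith("M34")) >= min_count


def note_matches(notes):
    """Return True if any note mentions a Raynaud or esophageal keyword (assumed OR rule)."""
    text = " ".join(notes)
    return bool(RAYNAUD.search(text) or ESOPHAGEAL.search(text))


def cohort_ppv(cohort):
    """PPV = chart-confirmed cases / all patients flagged by the algorithm."""
    if not cohort:
        raise ValueError("PPV is undefined for an empty cohort")
    confirmed = sum(1 for patient in cohort if patient["chart_confirmed"])
    return confirmed / len(cohort)


# Invented toy records; 'chart_confirmed' stands in for manual chart review.
patients = [
    {"id": "p1", "codes": ["M34.0", "M34.9", "I10"],
     "notes": ["History of Raynaud's phenomenon."], "chart_confirmed": True},
    {"id": "p2", "codes": ["M34.1", "M34.1"],
     "notes": ["No relevant symptoms."], "chart_confirmed": False},
    {"id": "p3", "codes": ["M34.9"],
     "notes": ["Dysphagia noted."], "chart_confirmed": True},
    {"id": "p4", "codes": ["M34.8", "M34.8", "K21.9"],
     "notes": ["Reports dysphagia and reflux."], "chart_confirmed": True},
]

# Step 1: structured-data filter (M34* documented 2 or more times).
code_cohort = [p for p in patients if has_repeated_m34(p["codes"])]

# Step 2: keyword search over free-text notes within the coded cohort.
utp_cohort = [p for p in code_cohort if note_matches(p["notes"])]

print("code-only PPV:", round(cohort_ppv(code_cohort), 2))
print("code + keyword PPV:", round(cohort_ppv(utp_cohort), 2))
```

In this toy data the keyword step drops the one unconfirmed patient, so the PPV rises, which mirrors the direction of the reported change (78% to 84%) without reproducing the study's numbers. A real pipeline would also need negation handling (for example "denies dysphagia"), which this sketch omits.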