Validation of diagnosis codes to identify side of colon in an electronic health record registry
BACKGROUND: The use of real-world data to generate evidence requires careful assessment and validation of critical variables before drawing clinical conclusions. Prospective clinical trial data suggest that anatomic origin of colon cancer impacts prognosis and treatment effectiveness. As an initial...
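Main authors: | Luhn, Patricia; Kuk, Deborah; Carrigan, Gillis; Nussbaum, Nathan; Sorg, Rachael; Rohrer, Rebecca; Tucker, Melisa G.; Arnieri, Brandon; Taylor, Michael D.; Meropol, Neal J. |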
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6700780/ https://www.ncbi.nlm.nih.gov/pubmed/31426736 http://dx.doi.org/10.1186/s12874-019-0824-7 |
_version_ | 1783444929215725568 |
---|---|
author | Luhn, Patricia Kuk, Deborah Carrigan, Gillis Nussbaum, Nathan Sorg, Rachael Rohrer, Rebecca Tucker, Melisa G. Arnieri, Brandon Taylor, Michael D. Meropol, Neal J. |
author_sort | Luhn, Patricia |
collection | PubMed |
description | BACKGROUND: The use of real-world data to generate evidence requires careful assessment and validation of critical variables before drawing clinical conclusions. Prospective clinical trial data suggest that the anatomic origin of colon cancer affects prognosis and treatment effectiveness. As an initial step in validating this observation in routine clinical settings, we explored the feasibility and accuracy of obtaining information on tumor sidedness from electronic health record (EHR) billing codes. METHODS: A total of 9,403 patients with metastatic colorectal cancer (mCRC) were selected from the Flatiron Health database, which is derived from de-identified EHR data. This study included a random sample of 200 mCRC patients. Tumor site data derived from International Classification of Diseases (ICD) codes were compared with data abstracted from unstructured documents in the EHR (e.g., surgical and pathology notes). Concordance was determined via observed agreement and Cohen’s kappa coefficient (κ). Accuracy of ICD codes for each tumor site (left, right, transverse) was determined by calculating the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), with corresponding 95% confidence intervals, using abstracted data as the gold standard. RESULTS: Study patients had characteristics and a side of colon distribution similar to those of the full mCRC dataset. The observed agreement between the ICD codes and abstracted data for tumor site across all sampled patients was 0.58 (κ = 0.41). When restricted to the 62% of patients with a side-specific ICD code, the observed agreement was 0.84 (κ = 0.79). The specificity (92–98%) of structured data for tumor location was high, with lower sensitivity (49–63%), PPV (64–92%) and NPV (72–97%). Demographic and clinical characteristics were similar between patients with specific and non-specific side of colon ICD codes. CONCLUSIONS: ICD codes are a highly reliable indicator of tumor location when the specific location code is entered in the EHR. However, non-specific side of colon ICD codes are present for a sizable minority of patients, and structured data alone may not be adequate to support testing of some research hypotheses. Careful assessment of key variables is required before determining the need for clinical abstraction to supplement structured data in generating real-world evidence from EHRs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-019-0824-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6700780 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6700780 2019-08-26 BMC Med Res Methodol Research Article BioMed Central 2019-08-19 /pmc/articles/PMC6700780/ /pubmed/31426736 http://dx.doi.org/10.1186/s12874-019-0824-7 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Validation of diagnosis codes to identify side of colon in an electronic health record registry |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6700780/ https://www.ncbi.nlm.nih.gov/pubmed/31426736 http://dx.doi.org/10.1186/s12874-019-0824-7 |
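
The agreement and accuracy measures named in the abstract (observed agreement, Cohen's kappa, and per-site sensitivity, specificity, PPV and NPV) can be illustrated with a minimal Python sketch. The function names and the 3x3 table of counts below are hypothetical and chosen only for illustration; they are not the study's data or the authors' code. Rows are assumed to hold the abstracted (reference) tumor site and columns the ICD-derived site. Confidence intervals are left out because the record does not state which interval method was used.

```python
from typing import Dict, List

Matrix = List[List[int]]


def observed_agreement(matrix: Matrix) -> float:
    """Share of patients whose two classifications match (the diagonal)."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


def cohens_kappa(matrix: Matrix) -> float:
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_observed = observed_agreement(matrix)
    p_expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total**2
    return (p_observed - p_expected) / (1 - p_expected)


def one_vs_rest(matrix: Matrix, k: int) -> Dict[str, float]:
    """Accuracy of category k against all other categories combined.

    Assumes every denominator is non-zero (no empty row or column).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                          # reference k, coded otherwise
    fp = sum(matrix[i][k] for i in range(n)) - tp     # coded k, reference otherwise
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, NOT taken from the study.
# Row/column order: left, right, transverse.
toy = [
    [40, 5, 5],
    [4, 30, 6],
    [1, 2, 7],
]

print("observed agreement:", observed_agreement(toy))
print("kappa:", cohens_kappa(toy))
print("left vs rest:", one_vs_rest(toy, 0))
```

A sketch like this only applies to a square table, so patients with a non-specific ICD code would need to be excluded or handled as a separate category first.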