
Validation of diagnosis codes to identify side of colon in an electronic health record registry

BACKGROUND: The use of real-world data to generate evidence requires careful assessment and validation of critical variables before drawing clinical conclusions. Prospective clinical trial data suggest that anatomic origin of colon cancer impacts prognosis and treatment effectiveness. As an initial...


Bibliographic Details
Main Authors: Luhn, Patricia, Kuk, Deborah, Carrigan, Gillis, Nussbaum, Nathan, Sorg, Rachael, Rohrer, Rebecca, Tucker, Melisa G., Arnieri, Brandon, Taylor, Michael D., Meropol, Neal J.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6700780/
https://www.ncbi.nlm.nih.gov/pubmed/31426736
http://dx.doi.org/10.1186/s12874-019-0824-7
author Luhn, Patricia
Kuk, Deborah
Carrigan, Gillis
Nussbaum, Nathan
Sorg, Rachael
Rohrer, Rebecca
Tucker, Melisa G.
Arnieri, Brandon
Taylor, Michael D.
Meropol, Neal J.
collection PubMed
description BACKGROUND: The use of real-world data to generate evidence requires careful assessment and validation of critical variables before drawing clinical conclusions. Prospective clinical trial data suggest that anatomic origin of colon cancer impacts prognosis and treatment effectiveness. As an initial step in validating this observation in routine clinical settings, we explored the feasibility and accuracy of obtaining information on tumor sidedness from electronic health records (EHR) billing codes.
METHODS: Nine thousand four hundred three patients with metastatic colorectal cancer (mCRC) were selected from the Flatiron Health database, which is derived from de-identified EHR data. This study included a random sample of 200 mCRC patients. Tumor site data derived from International Classification of Diseases (ICD) codes were compared with data abstracted from unstructured documents in the EHR (e.g. surgical and pathology notes). Concordance was determined via observed agreement and Cohen’s kappa coefficient (κ). Accuracy of ICD codes for each tumor site (left, right, transverse) was determined by calculating the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), and corresponding 95% confidence intervals, using abstracted data as the gold standard.
RESULTS: Study patients had similar characteristics and side of colon distribution compared with the full mCRC dataset. The observed agreement between the ICD codes and abstracted data for tumor site for all sampled patients was 0.58 (κ = 0.41). When restricting to the 62% of patients with a side-specific ICD code, the observed agreement was 0.84 (κ = 0.79). The specificity (92–98%) of structured data for tumor location was high, with lower sensitivity (49–63%), PPV (64–92%) and NPV (72–97%). Demographic and clinical characteristics were similar between patients with specific and non-specific side of colon ICD codes.
CONCLUSIONS: ICD codes are a highly reliable indicator of tumor location when the specific location code is entered in the EHR. However, non-specific side of colon ICD codes are present for a sizable minority of patients, and structured data alone may not be adequate to support testing of some research hypotheses. Careful assessment of key variables is required before determining the need for clinical abstraction to supplement structured data in generating real-world evidence from EHRs.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12874-019-0824-7) contains supplementary material, which is available to authorized users.
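The concordance and accuracy measures named in METHODS (observed agreement, Cohen's kappa, and per-site sensitivity, specificity, PPV, and NPV against abstracted data as the reference) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the eight-patient toy data are hypothetical, the "unspecified" label is an assumed stand-in for a non-specific ICD code, and confidence intervals are left out because the record does not say how they were computed.

```python
from collections import Counter


def observed_agreement(coded, abstracted):
    """Fraction of patients whose ICD-derived site equals the abstracted site."""
    if not coded or len(coded) != len(abstracted):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(c == a for c, a in zip(coded, abstracted))
    return matches / len(coded)


def cohens_kappa(coded, abstracted):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(coded)
    p_o = observed_agreement(coded, abstracted)
    coded_counts = Counter(coded)
    abstracted_counts = Counter(abstracted)
    labels = set(coded_counts) | set(abstracted_counts)
    # Chance agreement from the product of the two marginal distributions.
    p_e = sum(coded_counts[l] * abstracted_counts[l] for l in labels) / (n * n)
    if p_e == 1.0:
        return 1.0  # both sources used a single identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def one_vs_rest_metrics(coded, abstracted, site):
    """Accuracy of the ICD-derived label for one site, abstraction as reference."""
    tp = fp = fn = tn = 0
    for c, a in zip(coded, abstracted):
        if c == site and a == site:
            tp += 1
        elif c == site:
            fp += 1
        elif a == site:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data (NOT from the study): one entry per patient.
coded = ["left", "left", "right", "unspecified",
         "transverse", "right", "unspecified", "left"]
abstracted = ["left", "left", "right", "left",
              "transverse", "right", "right", "right"]

if __name__ == "__main__":
    print("observed agreement:", round(observed_agreement(coded, abstracted), 3))
    print("kappa:", round(cohens_kappa(coded, abstracted), 3))
    print("left:", one_vs_rest_metrics(coded, abstracted, "left"))
```

In this sketch a non-specific code always counts as a disagreement and as a missed case for every site, which is one plausible reason sensitivity can fall while specificity stays high; restricting both lists to patients with a side-specific code before calling the functions mirrors the restricted analysis described in RESULTS.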
format Online
Article
Text
id pubmed-6700780
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6700780 2019-08-26 Validation of diagnosis codes to identify side of colon in an electronic health record registry BMC Med Res Methodol Research Article BioMed Central 2019-08-19 /pmc/articles/PMC6700780/ /pubmed/31426736 http://dx.doi.org/10.1186/s12874-019-0824-7 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Validation of diagnosis codes to identify side of colon in an electronic health record registry
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6700780/
https://www.ncbi.nlm.nih.gov/pubmed/31426736
http://dx.doi.org/10.1186/s12874-019-0824-7