
Development and Validation of Coding Algorithms to Identify Patients with Incident Non-Small Cell Lung Cancer in United States Healthcare Claims Data

PURPOSE: We sought to develop and validate an incident non-small cell lung cancer (NSCLC) algorithm for United States (US) healthcare claims data. Diagnoses and procedures, but not medications, were incorporated to support longer-term relevance and reliability. METHODS: Patients with newly diagnosed...


Bibliographic Details
Main Authors: Beyrer, Julie, Nelson, David R, Sheffield, Kristin M, Huang, Yu-Jing, Lau, Yiu-Keung, Hincapie, Ana L
Format: Online Article Text
Language: English
Published: Dove 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9842515/
https://www.ncbi.nlm.nih.gov/pubmed/36659903
http://dx.doi.org/10.2147/CLEP.S389824
author Beyrer, Julie
Nelson, David R
Sheffield, Kristin M
Huang, Yu-Jing
Lau, Yiu-Keung
Hincapie, Ana L
collection PubMed
description PURPOSE: We sought to develop and validate an incident non-small cell lung cancer (NSCLC) algorithm for United States (US) healthcare claims data. Diagnoses and procedures, but not medications, were incorporated to support longer-term relevance and reliability. METHODS: Patients with newly diagnosed NSCLC per Surveillance, Epidemiology, and End Results (SEER) served as cases. Controls included newly diagnosed small-cell lung cancer and other lung cancers, and two 5% random samples for other cancer and without cancer. Algorithms derived from logistic regression and machine learning methods used the entire sample (Approach A) or started with a previous algorithm for those with lung cancer (Approach B). Sensitivity, specificity, positive predictive values (PPV), negative predictive values, and F-scores (compared for 1000 bootstrap samples) were calculated. Misclassification was evaluated by calculating the odds of selection by the algorithm among true positives and true negatives. RESULTS: The best performing algorithm utilized neural networks (Approach B). A 10-variable point-score algorithm was derived from logistic regression (Approach B); sensitivity was 77.69% and PPV = 67.61% (F-score = 72.30%). This algorithm was less sensitive for patients ≥80 years old, with Medicare follow-up time <3 months, or missing SEER data on stage, laterality, or site and less specific for patients with SEER primary site of main bronchus, SEER summary stage 2000 regional by direct extension only, or pre-index chronic pulmonary disease. CONCLUSION: Our study developed and validated a practical, 10-variable, point-based algorithm for identifying incident NSCLC cases in a US claims database based on a previously validated incident lung cancer algorithm.
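The reported figures in the description can be cross-checked with a short sketch. It assumes the F-score is the usual harmonic mean of sensitivity and PPV (the F1 definition); the abstract does not state the formula, but this reading reproduces the reported 72.30%. The function names and the confusion-matrix counts in the second example are hypothetical illustrations, not data from the study.

```python
def f_score(sensitivity: float, ppv: float) -> float:
    """Harmonic mean of sensitivity (recall) and PPV (precision).

    Assumption: the record's "F-score" is the balanced F1 measure.
    """
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


def validation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard validation metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true cases flagged by the algorithm
    specificity = tn / (tn + fp)   # non-cases correctly left unflagged
    ppv = tp / (tp + fp)           # flagged patients who are true cases
    npv = tn / (tn + fn)           # unflagged patients who are non-cases
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_score": f_score(sensitivity, ppv),
    }


# Values reported for the 10-variable point-score algorithm (Approach B).
reported = f_score(0.7769, 0.6761)
print(f"F-score from reported sensitivity and PPV: {reported:.2%}")

# Hypothetical counts, for illustration only (not taken from the study).
example = validation_metrics(tp=80, fp=20, fn=20, tn=880)
print(example)
```

Because PPV depends on how common true cases are in the sample, the same algorithm would likely show a different PPV and F-score in a population with a different case mix, while sensitivity and specificity would be less affected.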
format Online
Article
Text
id pubmed-9842515
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Dove
record_format MEDLINE/PubMed
spelling pubmed-9842515 2023-01-18 Clin Epidemiol Original Research Dove 2023-01-12 /pmc/articles/PMC9842515/ /pubmed/36659903 http://dx.doi.org/10.2147/CLEP.S389824 Text en © 2023 Beyrer et al. https://creativecommons.org/licenses/by-nc/3.0/ This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (https://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms (https://www.dovepress.com/terms.php).
title Development and Validation of Coding Algorithms to Identify Patients with Incident Non-Small Cell Lung Cancer in United States Healthcare Claims Data
topic Original Research