Administrative healthcare data to predict performance status in lung cancer patients

The dataset includes 4488 patients diagnosed with lung cancer (ICD-O 3[3], C33-C34) between 2010–2012 and 2016–2018 in the territory of the Agency for Health Protection (ATS) of Milan, Italy, and selected from its population cancer registry on the basis of availability of the following information:...

Bibliographic Details
Main Authors: Andreano, Anita, Russo, Antonio Giampiero
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8605231/
https://www.ncbi.nlm.nih.gov/pubmed/34825030
http://dx.doi.org/10.1016/j.dib.2021.107559
_version_ 1784602133238120448
author Andreano, Anita
Russo, Antonio Giampiero
author_facet Andreano, Anita
Russo, Antonio Giampiero
author_sort Andreano, Anita
collection PubMed
description The dataset includes 4488 patients diagnosed with lung cancer (ICD-O 3[3], C33-C34) in 2010–2012 and 2016–2018 in the territory of the Agency for Health Protection (ATS) of Milan, Italy, and selected from its population cancer registry on the basis of availability of the following information: performance status (PS), age, sex, and stage at diagnosis. The dataset also includes the following variables, extracted from the health databases of the ATS and linked to the variables derived from the cancer registry through deterministic record linkage on a unique key (tax code): Charlson comorbidity index, presence of chronic obstructive pulmonary disease, number of hospitalizations, outpatient visits, emergency accesses and prescribed drugs in the previous year, and dispensed durable medical equipment in the previous three years. The dataset was used to develop a logistic prediction model for PS, dichotomized as ‘poor’ (ECOG, 3–5) and ‘good’ (ECOG, 0–2), on the basis of all other variables in the dataset. The prediction model was developed on a 50% random subsample of the described dataset (development dataset, n = 2,244) and validated on the remaining half. The areas under the curve (AUC) of the model in the development and validation samples were 0.76 and 0.73, respectively. The developed model was used to predict ‘good’ vs. ‘poor’ PS in a sample of patients with advanced lung cancer, from the same registry and years, for whom PS information was not available. Researchers using registry data, or electronic claims, to perform studies of oncologic therapy effectiveness for lung cancer could use the reported coefficients to predict PS value, dichotomized as ‘good’ or ‘poor’.
format Online
Article
Text
id pubmed-8605231
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8605231 2021-11-24 Administrative healthcare data to predict performance status in lung cancer patients Andreano, Anita Russo, Antonio Giampiero Data Brief Data Article
Elsevier 2021-11-11 /pmc/articles/PMC8605231/ /pubmed/34825030 http://dx.doi.org/10.1016/j.dib.2021.107559 Text en © 2021 The Author(s). Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Data Article
Andreano, Anita
Russo, Antonio Giampiero
Administrative healthcare data to predict performance status in lung cancer patients
title Administrative healthcare data to predict performance status in lung cancer patients
title_full Administrative healthcare data to predict performance status in lung cancer patients
title_fullStr Administrative healthcare data to predict performance status in lung cancer patients
title_full_unstemmed Administrative healthcare data to predict performance status in lung cancer patients
title_short Administrative healthcare data to predict performance status in lung cancer patients
title_sort administrative healthcare data to predict performance status in lung cancer patients
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8605231/
https://www.ncbi.nlm.nih.gov/pubmed/34825030
http://dx.doi.org/10.1016/j.dib.2021.107559
work_keys_str_mv AT andreanoanita administrativehealthcaredatatopredictperformancestatusinlungcancerpatients
AT russoantoniogiampiero administrativehealthcaredatatopredictperformancestatusinlungcancerpatients
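
Illustrative note: the description in this record outlines a logistic model for poor vs. good PS, a 50% random development/validation split, and an AUC evaluation. The following is a minimal Python sketch of those three steps, not the authors' code. The coefficient values, the variable names (`age_per_year`, `charlson_index`, `hospitalizations_prev_year`), and the seed are hypothetical placeholders; the actual coefficients must be taken from the article itself.

```python
import math
import random

# HYPOTHETICAL coefficients, for illustration only. They are NOT the values
# reported in the article; replace them with the published ones before use.
HYPOTHETICAL_COEFFS = {
    "intercept": -3.0,
    "age_per_year": 0.03,
    "charlson_index": 0.25,
    "hospitalizations_prev_year": 0.30,
}


def predict_poor_ps_probability(patient, coeffs=HYPOTHETICAL_COEFFS):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = coeffs["intercept"]
    for name, beta in coeffs.items():
        if name != "intercept":
            linear += beta * patient[name]
    return 1.0 / (1.0 + math.exp(-linear))


def split_half(records, seed=0):
    """Random 50/50 split into development and validation samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Splitting 4488 records in half gives 2244 per sample, matching the
# development sample size stated in the description.
development, validation = split_half(range(4488))
```

Under these assumptions, `predict_poor_ps_probability` returns the predicted probability of poor PS (ECOG 3–5) for one patient; any cut-off used to dichotomize that probability into 'good' or 'poor' would be a further choice not specified in this record.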