Imputing pre-diagnosis health behaviour in cancer registry data and investigating its relationship with oesophageal cancer survival time

BACKGROUND: As oesophageal cancer has short survival, it is likely pre-diagnosis health behaviours will have carry-over effects on post-diagnosis survival times. Cancer registry data sets do not usually contain pre-diagnosis health behaviours and so need to be augmented with data from external healt...

Bibliographic Details
Main Authors: Fahey, Paul P.; Page, Andrew; Astell-Burt, Thomas; Stone, Glenn
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8670692/
https://www.ncbi.nlm.nih.gov/pubmed/34905568
http://dx.doi.org/10.1371/journal.pone.0261416
_version_ 1784615016221114368
author Fahey, Paul P.
Page, Andrew
Astell-Burt, Thomas
Stone, Glenn
collection PubMed
description BACKGROUND: As oesophageal cancer has short survival, it is likely pre-diagnosis health behaviours will have carry-over effects on post-diagnosis survival times. Cancer registry data sets do not usually contain pre-diagnosis health behaviours and so need to be augmented with data from external health surveys. A new algorithm is introduced and tested to augment cancer registries with external data when one-to-one data linkage is not available. METHODS: The algorithm uses external health survey data to impute pre-diagnosis health behaviour for cancer patients, estimates the misclassification errors in these imputed values, and then fits a misclassification-corrected Cox regression to quantify the association between pre-diagnosis health behaviour and post-diagnosis survival. Data from US cancer registries and a US national health survey are used in testing the algorithm. RESULTS: The algorithm works effectively on simulated smoking data when there is no age confounding. However, age confounding does exist (risk of death increases with age and most health behaviours change with age) and interferes with the performance of the algorithm. The estimated hazard ratio (HR) for pre-diagnosis smoking was HR = 1.32 (95% CI 0.82, 2.68), with HR = 1.93 (95% CI 1.08, 7.07) in the squamous cell sub-group, and pre-diagnosis physical activity was protective of survival, with HR = 0.25 (95% CI 0.03, 0.81). The method failed for less common behaviours (such as heavy drinking). CONCLUSIONS: Further improvements in the I2C2 algorithm will permit enrichment of cancer registry data through imputation of new variables with negligible risk to patient confidentiality, opening new research opportunities in cancer epidemiology.
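The METHODS summary names three steps: impute, estimate misclassification, fit a corrected Cox regression. As a rough illustration of the second step only, the sketch below computes the expected sensitivity and specificity of an imputed binary behaviour under a simple assumed scheme: each registry patient's behaviour is drawn independently with the survey prevalence of their demographic cell. This is not the authors' implementation; the `Stratum` class, the `expected_misclassification` function, and the cell sizes and prevalences are hypothetical values chosen for illustration, not data from the article.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One demographic cell shared by the survey and the registry (hypothetical)."""
    name: str
    n_patients: int    # registry patients falling in this cell
    prevalence: float  # survey estimate of P(behaviour = 1) in this cell


def expected_misclassification(strata):
    """Return (sensitivity, specificity) of the imputed behaviour.

    Assumption: within a cell, the true and the imputed behaviour are
    independent Bernoulli variables with the same prevalence p, so
    P(imputed = 1 and true = 1) = p * p in that cell.
    """
    true_pos = sum(s.n_patients * s.prevalence for s in strata)
    true_neg = sum(s.n_patients * (1 - s.prevalence) for s in strata)
    both_pos = sum(s.n_patients * s.prevalence ** 2 for s in strata)
    both_neg = sum(s.n_patients * (1 - s.prevalence) ** 2 for s in strata)
    return both_pos / true_pos, both_neg / true_neg


# Illustrative cells only; these numbers are made up.
strata = [
    Stratum("male, 50-64", n_patients=400, prevalence=0.30),
    Stratum("male, 65+", n_patients=600, prevalence=0.10),
]
sensitivity, specificity = expected_misclassification(strata)
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```

Under this assumed scheme, sensitivity falls as the behaviour becomes rarer, because an imputed "yes" seldom coincides with a true "yes" when prevalence is low.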
format Online
Article
Text
id pubmed-8670692
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8670692 2021-12-15 PLoS One Research Article Public Library of Science 2021-12-14 /pmc/articles/PMC8670692/ /pubmed/34905568 http://dx.doi.org/10.1371/journal.pone.0261416 Text en © 2021 Fahey et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
topic Research Article