Imputing pre-diagnosis health behaviour in cancer registry data and investigating its relationship with oesophageal cancer survival time
BACKGROUND: As oesophageal cancer has short survival, it is likely pre-diagnosis health behaviours will have carry-over effects on post-diagnosis survival times. Cancer registry data sets do not usually contain pre-diagnosis health behaviours and so need to be augmented with data from external healt...
Main authors: | Fahey, Paul P.; Page, Andrew; Astell-Burt, Thomas; Stone, Glenn |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8670692/ https://www.ncbi.nlm.nih.gov/pubmed/34905568 http://dx.doi.org/10.1371/journal.pone.0261416 |
_version_ | 1784615016221114368 |
---|---|
author | Fahey, Paul P. Page, Andrew Astell-Burt, Thomas Stone, Glenn |
author_facet | Fahey, Paul P. Page, Andrew Astell-Burt, Thomas Stone, Glenn |
author_sort | Fahey, Paul P. |
collection | PubMed |
description | BACKGROUND: As oesophageal cancer has short survival, it is likely that pre-diagnosis health behaviours will have carry-over effects on post-diagnosis survival times. Cancer registry data sets do not usually contain pre-diagnosis health behaviours and so need to be augmented with data from external health surveys. A new algorithm is introduced and tested to augment cancer registries with external data when one-to-one data linkage is not available. METHODS: The algorithm uses external health survey data to impute pre-diagnosis health behaviour for cancer patients, estimates the misclassification errors in these imputed values, and then fits a misclassification-corrected Cox regression to quantify the association between pre-diagnosis health behaviour and post-diagnosis survival. Data from US cancer registries and a US national health survey are used in testing the algorithm. RESULTS: The algorithm works effectively on simulated smoking data when there is no age confounding. But age confounding does exist (risk of death increases with age and most health behaviours change with age) and interferes with the performance of the algorithm. The estimated hazard ratio (HR) for pre-diagnosis smoking was HR = 1.32 (95% CI 0.82, 2.68), with HR = 1.93 (95% CI 1.08, 7.07) in the squamous cell sub-group. Pre-diagnosis physical activity was protective of survival, with HR = 0.25 (95% CI 0.03, 0.81). The method failed for less common behaviours (such as heavy drinking). CONCLUSIONS: Further improvements in the I2C2 algorithm will permit enrichment of cancer registry data through imputation of new variables with negligible risk to patient confidentiality, opening new research opportunities in cancer epidemiology. |
format | Online Article Text |
id | pubmed-8670692 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8670692 2021-12-15 PLoS One Research Article Public Library of Science 2021-12-14 /pmc/articles/PMC8670692/ /pubmed/34905568 http://dx.doi.org/10.1371/journal.pone.0261416 Text en © 2021 Fahey et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Fahey, Paul P. Page, Andrew Astell-Burt, Thomas Stone, Glenn Imputing pre-diagnosis health behaviour in cancer registry data and investigating its relationship with oesophageal cancer survival time |
title | Imputing pre-diagnosis health behaviour in cancer registry data and investigating its relationship with oesophageal cancer survival time |
title_sort | imputing pre-diagnosis health behaviour in cancer registry data and investigating its relationship with oesophageal cancer survival time |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8670692/ https://www.ncbi.nlm.nih.gov/pubmed/34905568 http://dx.doi.org/10.1371/journal.pone.0261416 |
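
The first two steps named in the METHODS summary of the description field (impute a behaviour from external survey data, then estimate how often the imputed values are wrong) can be illustrated with a minimal Python sketch. This is not the authors' I2C2 implementation, and it leaves out the misclassification-corrected Cox regression step entirely. The function names, the age-band strata and the toy survey rows are all hypothetical, and the error formula assumes that each imputed value is drawn independently of the patient's true behaviour within a stratum.

```python
import random
from collections import defaultdict


def stratum_prevalence(survey_rows):
    """Return {stratum: share of survey respondents reporting the behaviour}."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_with_behaviour, n_total]
    for stratum, behaviour in survey_rows:
        counts[stratum][0] += behaviour
        counts[stratum][1] += 1
    return {s: k / n for s, (k, n) in counts.items()}


def impute(registry_strata, prevalence, rng):
    """Draw an imputed 0/1 behaviour for each patient from its stratum prevalence."""
    return [1 if rng.random() < prevalence[s] else 0 for s in registry_strata]


def expected_misclassification(p):
    """Expected share of imputed values that disagree with the unobserved truth.

    Assumption: the draw is independent of the truth within the stratum, so
    false positives occur with probability (1 - p) * p and false negatives
    with probability p * (1 - p).
    """
    return 2 * p * (1 - p)


# Hypothetical toy survey: (age band, smoker flag)
survey = [
    ("50-64", 1), ("50-64", 0), ("50-64", 0), ("50-64", 1),
    ("65+", 0), ("65+", 0), ("65+", 0), ("65+", 1),
]
prev = stratum_prevalence(survey)  # {"50-64": 0.5, "65+": 0.25}

# Hypothetical registry patients, known only by age band
imputed = impute(["50-64", "65+", "65+"], prev, random.Random(0))
error_by_stratum = {s: expected_misclassification(p) for s, p in prev.items()}
```

Stratifying by age band is one simple way to bring in the age structure that the abstract identifies as a confounder; a real analysis would need finer strata and a survival model that accounts for the estimated error rates.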