Imputation of missing values of tumour stage in population-based cancer registration

BACKGROUND: Missing data on tumour stage information is a common problem in population-based cancer registries. Statistical analyses on the level of tumour stage may be biased, if no adequate method for handling of missing data is applied. In order to determine a useful way to treat missing data on...

Bibliographic Details
Main Authors: Eisemann, Nora, Waldmann, Annika, Katalinic, Alexander
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3184281/
https://www.ncbi.nlm.nih.gov/pubmed/21929796
http://dx.doi.org/10.1186/1471-2288-11-129
author Eisemann, Nora
Waldmann, Annika
Katalinic, Alexander
collection PubMed
description BACKGROUND: Missing data on tumour stage information is a common problem in population-based cancer registries. Statistical analyses on the level of tumour stage may be biased, if no adequate method for handling of missing data is applied. In order to determine a useful way to treat missing data on tumour stage, we examined different imputation models for multiple imputation with chained equations for analysing the stage-specific numbers of cases of malignant melanoma and female breast cancer.
METHODS: This analysis was based on the malignant melanoma data set and the female breast cancer data set of the cancer registry Schleswig-Holstein, Germany. The cases with complete tumour stage information were extracted and their stage information partly removed according to a MAR missingness-pattern, resulting in five simulated data sets for each cancer entity. The missing tumour stage values were then treated with multiple imputation with chained equations, using polytomous regression, predictive mean matching, random forests and proportional sampling as imputation models. The estimated tumour stages, stage-specific numbers of cases and survival curves after multiple imputation were compared to the observed ones.
RESULTS: The amount of missing values for malignant melanoma was too high to estimate a reasonable number of cases for each UICC stage. However, multiple imputation of missing stage values led to stage-specific numbers of cases of T-stage for malignant melanoma as well as T- and UICC-stage for breast cancer close to the observed numbers of cases. The observed tumour stages on the individual level, the stage-specific numbers of cases and the observed survival curves were best met with polytomous regression or predictive mean matching but not with random forest or proportional sampling as imputation models.
CONCLUSIONS: This limited simulation study indicates that multiple imputation with chained equations is an appropriate technique for dealing with missing information on tumour stage in population-based cancer registries, if the amount of unstaged cases is on a reasonable level.
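The simulation design summarised in the abstract (remove stage values under a MAR pattern, impute several times, compare stage-specific case counts) can be illustrated with a small sketch. This is not the authors' code and not a full chained-equations procedure: it uses invented toy data, a single hypothetical covariate (`older`), made-up missingness probabilities, and only the simplest of the named models, proportional sampling, here assumed to mean drawing a missing stage from the observed stage distribution within the same covariate group. All function names are illustrative.

```python
import random
from collections import Counter

STAGES = ["T1", "T2", "T3", "T4"]

def simulate_cases(n, rng):
    """Toy registry (assumed data): stage distribution depends on age group."""
    cases = []
    for _ in range(n):
        older = rng.random() < 0.5
        weights = [0.30, 0.30, 0.25, 0.15] if older else [0.55, 0.30, 0.10, 0.05]
        cases.append({"older": older, "stage": rng.choices(STAGES, weights)[0]})
    return cases

def remove_stage_mar(cases, rng, p_older=0.4, p_younger=0.1):
    """MAR: the chance of losing the stage depends only on the observed covariate."""
    masked = []
    for c in cases:
        p = p_older if c["older"] else p_younger
        stage = None if rng.random() < p else c["stage"]
        masked.append({"older": c["older"], "stage": stage})
    return masked

def impute_proportional(cases, rng):
    """Fill each missing stage with a random draw from observed stages in the same group."""
    pools = {True: [], False: []}
    for c in cases:
        if c["stage"] is not None:
            pools[c["older"]].append(c["stage"])
    return [
        c["stage"] if c["stage"] is not None else rng.choice(pools[c["older"]])
        for c in cases
    ]

def pooled_stage_counts(cases, m, rng):
    """Average the stage-specific case counts over m imputed data sets."""
    totals = Counter()
    for _ in range(m):
        totals.update(impute_proportional(cases, rng))
    return {s: totals[s] / m for s in STAGES}

if __name__ == "__main__":
    rng = random.Random(0)
    complete = simulate_cases(2000, rng)
    masked = remove_stage_mar(complete, rng)
    print("true   ", dict(Counter(c["stage"] for c in complete)))
    print("imputed", pooled_stage_counts(masked, m=5, rng=rng))
```

Because the draw is stratified by the covariate that drives missingness, the pooled counts in this toy setting tend to land near the true ones. The abstract reports that, on the real registry data, polytomous regression and predictive mean matching matched the observed values better than proportional sampling, so the sketch should be read as an illustration of the workflow only.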
format Online
Article
Text
id pubmed-3184281
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3184281 2011-10-02 Imputation of missing values of tumour stage in population-based cancer registration Eisemann, Nora Waldmann, Annika Katalinic, Alexander BMC Med Res Methodol Research Article BioMed Central 2011-09-19 /pmc/articles/PMC3184281/ /pubmed/21929796 http://dx.doi.org/10.1186/1471-2288-11-129 Text en Copyright ©2011 Eisemann et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Imputation of missing values of tumour stage in population-based cancer registration
topic Research Article