Methods for significance testing of categorical covariates in logistic regression models after multiple imputation: power and applicability analysis

BACKGROUND: Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels...

Bibliographic Details
Main Authors:	Eekhout, Iris; van de Wiel, Mark A.; Heymans, Martijn W.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Technical Advance
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5568368/ https://www.ncbi.nlm.nih.gov/pubmed/28830466 http://dx.doi.org/10.1186/s12874-017-0404-7

_version_	1783258849494433792
author	Eekhout, Iris; van de Wiel, Mark A.; Heymans, Martijn W.
author_sort	Eekhout, Iris
collection	PubMed
description	BACKGROUND: Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, several methods are available to test whether a categorical covariate with more than two levels significantly contributes to the model, for example pooling chi-square tests with multiple degrees of freedom, pooling likelihood ratio test statistics, and pooling based on the covariance matrix of the regression model. These methods are more complex than RR and are not available in all mainstream statistical software packages. In addition, they do not always obtain optimal power levels. We argue that the median of the p-values from the overall significance tests from the analyses on the imputed datasets can be used as an alternative pooling rule for categorical variables. The aim of the current study is to compare different methods to test a categorical variable for significance after multiple imputation in terms of applicability and power. METHODS: In a large simulation study, we demonstrated the control of the type I error and power levels of different pooling methods for categorical variables. RESULTS: This simulation study showed that for non-significant categorical covariates the type I error is controlled and the statistical power of the median pooling rule was at least equal to current multiple parameter tests. An empirical data example showed similar results. CONCLUSIONS: It can therefore be concluded that using the median of the p-values from the imputed data analyses is an attractive and easy-to-use alternative method for significance testing of categorical variables. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12874-017-0404-7) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5568368
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5568368 2017-08-29 Eekhout, Iris; van de Wiel, Mark A.; Heymans, Martijn W. BMC Med Res Methodol Technical Advance BioMed Central 2017-08-22 /pmc/articles/PMC5568368/ /pubmed/28830466 http://dx.doi.org/10.1186/s12874-017-0404-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Methods for significance testing of categorical covariates in logistic regression models after multiple imputation: power and applicability analysis
topic	Technical Advance
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5568368/ https://www.ncbi.nlm.nih.gov/pubmed/28830466 http://dx.doi.org/10.1186/s12874-017-0404-7
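
The abstract in this record contrasts Rubin's Rules for pooling a single parameter estimate with the median of the p-values as a pooling rule for the overall test of a categorical covariate. The sketch below illustrates both ideas in plain Python. It is not the authors' code: the function names `rubin_pool` and `median_p_rule` are hypothetical, the input numbers are made up for illustration, and the per-dataset overall tests (for example a multi-degree-of-freedom chi-square or likelihood ratio test in each imputed dataset) are assumed to have been run elsewhere.

```python
from statistics import mean, median, variance


def rubin_pool(estimates, variances):
    """Pool one parameter across m imputed datasets with Rubin's Rules.

    estimates: the m point estimates (e.g. one logistic regression coefficient)
    variances: the m squared standard errors of those estimates
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


def median_p_rule(p_values):
    """Pool overall-test p-values by taking their median.

    p_values: one p-value per imputed dataset, each from the overall
    significance test of the categorical covariate in that dataset.
    """
    return median(p_values)


# Illustrative (made-up) inputs for m = 3 and m = 5 imputed datasets.
pooled_estimate, total_variance = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
pooled_p = median_p_rule([0.03, 0.08, 0.04, 0.06, 0.02])

print(pooled_estimate, total_variance)
print(pooled_p)
```

The median rule needs only the p-values that standard software already reports for each imputed dataset, which is the applicability argument made in the abstract. Rubin's Rules, as sketched here, pool one coefficient at a time and so do not directly give an overall test for a covariate with more than two levels.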

Similar Items