Identifying influential observations in Bayesian models by using Markov chain Monte Carlo

Bibliographic Details
Main Authors: Jackson, Dan, White, Ian R, Carpenter, James
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Ltd 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3500673/
https://www.ncbi.nlm.nih.gov/pubmed/21905065
http://dx.doi.org/10.1002/sim.4356
author Jackson, Dan
White, Ian R
Carpenter, James
collection PubMed
description In statistical modelling, it is often important to know how much parameter estimates are influenced by particular observations. An attractive approach is to re-estimate the parameters with each observation deleted in turn, but this is computationally demanding when fitting models by using Markov chain Monte Carlo (MCMC), as obtaining complete sample estimates is often in itself a very time-consuming task. Here we propose two efficient ways to approximate the case-deleted estimates by using output from MCMC estimation. Our first proposal, which directly approximates the usual influence statistics in maximum likelihood analyses of generalised linear models (GLMs), is easy to implement and avoids any further evaluation of the likelihood. Hence, unlike the existing alternatives, it does not become more computationally intensive as the model complexity increases. Our second proposal, which utilises model perturbations, also has this advantage and does not require the form of the GLM to be specified. We show how our two proposed methods are related and evaluate them against the existing method of importance sampling and case deletion in a logistic regression analysis with missing covariates. We also provide practical advice for those implementing our procedures, so that they may be used in many situations where MCMC is used to fit statistical models. Copyright © 2011 John Wiley & Sons, Ltd.
format Online
Article
Text
id pubmed-3500673
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher John Wiley & Sons, Ltd
record_format MEDLINE/PubMed
spelling pubmed-3500673; 2012-11-26; Stat Med; Special Issue Papers; John Wiley & Sons, Ltd; 2012-05-20; 2011-09-08; /pmc/articles/PMC3500673/ /pubmed/21905065 http://dx.doi.org/10.1002/sim.4356; Text; en; Copyright © 2012 John Wiley & Sons, Ltd. http://creativecommons.org/licenses/by/2.5/ Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.
title Identifying influential observations in Bayesian models by using Markov chain Monte Carlo
topic Special Issue Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3500673/
https://www.ncbi.nlm.nih.gov/pubmed/21905065
http://dx.doi.org/10.1002/sim.4356