Cargando…

Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases

BACKGROUND: The negative binomial distribution is used commonly throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that...

Descripción completa

Detalles Bibliográficos
Autor principal: Lloyd-Smith, James O.
Formato: Texto
Lenguaje:English
Publicado: Public Library of Science 2007
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1791715/
https://www.ncbi.nlm.nih.gov/pubmed/17299582
http://dx.doi.org/10.1371/journal.pone.0000180
_version_ 1782132143568191488
author Lloyd-Smith, James O.
author_facet Lloyd-Smith, James O.
author_sort Lloyd-Smith, James O.
collection PubMed
description BACKGROUND: The negative binomial distribution is used commonly throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that are not highly overdispersed (i.e., those with k≥1), and the accuracy of confidence intervals estimated for k is typically not explored. METHODOLOGY: This article presents a simulation study exploring the bias, precision, and confidence interval coverage of maximum-likelihood estimates of k from highly overdispersed distributions. In addition to exploring small-sample bias on negative binomial estimates, the study addresses estimation from datasets influenced by two types of event under-counting, and from disease transmission data subject to selection bias for successful outbreaks. CONCLUSIONS: Results show that maximum likelihood estimates of k can be biased upward by small sample size or under-reporting of zero-class events, but are not biased downward by any of the factors considered. Confidence intervals estimated from the asymptotic sampling variance tend to exhibit coverage below the nominal level, with overestimates of k comprising the great majority of coverage errors. Estimation from outbreak datasets does not increase the bias of k estimates, but can add significant upward bias to estimates of the mean. Because k varies inversely with the degree of overdispersion, these findings show that overestimation of the degree of overdispersion is very rare for these datasets.
format Text
id pubmed-1791715
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-17917152007-02-14 Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases Lloyd-Smith, James O. PLoS One Research Article BACKGROUND: The negative binomial distribution is used commonly throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that are not highly overdispersed (i.e., those with k≥1), and the accuracy of confidence intervals estimated for k is typically not explored. METHODOLOGY: This article presents a simulation study exploring the bias, precision, and confidence interval coverage of maximum-likelihood estimates of k from highly overdispersed distributions. In addition to exploring small-sample bias on negative binomial estimates, the study addresses estimation from datasets influenced by two types of event under-counting, and from disease transmission data subject to selection bias for successful outbreaks. CONCLUSIONS: Results show that maximum likelihood estimates of k can be biased upward by small sample size or under-reporting of zero-class events, but are not biased downward by any of the factors considered. Confidence intervals estimated from the asymptotic sampling variance tend to exhibit coverage below the nominal level, with overestimates of k comprising the great majority of coverage errors. Estimation from outbreak datasets does not increase the bias of k estimates, but can add significant upward bias to estimates of the mean. Because k varies inversely with the degree of overdispersion, these findings show that overestimation of the degree of overdispersion is very rare for these datasets. Public Library of Science 2007-02-14 /pmc/articles/PMC1791715/ /pubmed/17299582 http://dx.doi.org/10.1371/journal.pone.0000180 Text en James Lloyd-Smith. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Lloyd-Smith, James O.
Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
title Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
title_full Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
title_fullStr Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
title_full_unstemmed Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
title_short Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
title_sort maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1791715/
https://www.ncbi.nlm.nih.gov/pubmed/17299582
http://dx.doi.org/10.1371/journal.pone.0000180
work_keys_str_mv AT lloydsmithjameso maximumlikelihoodestimationofthenegativebinomialdispersionparameterforhighlyoverdisperseddatawithapplicationstoinfectiousdiseases