Cargando…
Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases
BACKGROUND: The negative binomial distribution is used commonly throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that...
Autor principal: | |
---|---|
Formato: | Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2007
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1791715/ https://www.ncbi.nlm.nih.gov/pubmed/17299582 http://dx.doi.org/10.1371/journal.pone.0000180 |
_version_ | 1782132143568191488 |
---|---|
author | Lloyd-Smith, James O. |
author_facet | Lloyd-Smith, James O. |
author_sort | Lloyd-Smith, James O. |
collection | PubMed |
description | BACKGROUND: The negative binomial distribution is used commonly throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that are not highly overdispersed (i.e., those with k≥1), and the accuracy of confidence intervals estimated for k is typically not explored. METHODOLOGY: This article presents a simulation study exploring the bias, precision, and confidence interval coverage of maximum-likelihood estimates of k from highly overdispersed distributions. In addition to exploring small-sample bias on negative binomial estimates, the study addresses estimation from datasets influenced by two types of event under-counting, and from disease transmission data subject to selection bias for successful outbreaks. CONCLUSIONS: Results show that maximum likelihood estimates of k can be biased upward by small sample size or under-reporting of zero-class events, but are not biased downward by any of the factors considered. Confidence intervals estimated from the asymptotic sampling variance tend to exhibit coverage below the nominal level, with overestimates of k comprising the great majority of coverage errors. Estimation from outbreak datasets does not increase the bias of k estimates, but can add significant upward bias to estimates of the mean. Because k varies inversely with the degree of overdispersion, these findings show that overestimation of the degree of overdispersion is very rare for these datasets. |
format | Text |
id | pubmed-1791715 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-17917152007-02-14 Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases Lloyd-Smith, James O. PLoS One Research Article BACKGROUND: The negative binomial distribution is used commonly throughout biology as a model for overdispersed count data, with attention focused on the negative binomial dispersion parameter, k. A substantial literature exists on the estimation of k, but most attention has focused on datasets that are not highly overdispersed (i.e., those with k≥1), and the accuracy of confidence intervals estimated for k is typically not explored. METHODOLOGY: This article presents a simulation study exploring the bias, precision, and confidence interval coverage of maximum-likelihood estimates of k from highly overdispersed distributions. In addition to exploring small-sample bias on negative binomial estimates, the study addresses estimation from datasets influenced by two types of event under-counting, and from disease transmission data subject to selection bias for successful outbreaks. CONCLUSIONS: Results show that maximum likelihood estimates of k can be biased upward by small sample size or under-reporting of zero-class events, but are not biased downward by any of the factors considered. Confidence intervals estimated from the asymptotic sampling variance tend to exhibit coverage below the nominal level, with overestimates of k comprising the great majority of coverage errors. Estimation from outbreak datasets does not increase the bias of k estimates, but can add significant upward bias to estimates of the mean. Because k varies inversely with the degree of overdispersion, these findings show that overestimation of the degree of overdispersion is very rare for these datasets. Public Library of Science 2007-02-14 /pmc/articles/PMC1791715/ /pubmed/17299582 http://dx.doi.org/10.1371/journal.pone.0000180 Text en James Lloyd-Smith. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Lloyd-Smith, James O. Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases |
title | Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases |
title_full | Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases |
title_fullStr | Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases |
title_full_unstemmed | Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases |
title_short | Maximum Likelihood Estimation of the Negative Binomial Dispersion Parameter for Highly Overdispersed Data, with Applications to Infectious Diseases |
title_sort | maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1791715/ https://www.ncbi.nlm.nih.gov/pubmed/17299582 http://dx.doi.org/10.1371/journal.pone.0000180 |
work_keys_str_mv | AT lloydsmithjameso maximumlikelihoodestimationofthenegativebinomialdispersionparameterforhighlyoverdisperseddatawithapplicationstoinfectiousdiseases |