Bayes-optimal estimation of overlap between populations of fixed size
Measuring the overlap between two populations is, in principle, straightforward. Upon fully sampling both populations, the number of shared objects—species, taxonomical units, or gene variants, depending on the context—can be directly counted. In practice, however, only a fraction of each population...
Main author: | Larremore, Daniel B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6440621/ https://www.ncbi.nlm.nih.gov/pubmed/30925165 http://dx.doi.org/10.1371/journal.pcbi.1006898 |
_version_ | 1783407422621089792 |
---|---|
author | Larremore, Daniel B. |
collection | PubMed |
description | Measuring the overlap between two populations is, in principle, straightforward. Upon fully sampling both populations, the number of shared objects—species, taxonomical units, or gene variants, depending on the context—can be directly counted. In practice, however, only a fraction of each population’s objects are likely to be sampled due to stochastic data collection or sequencing techniques. Although methods exist for quantifying population overlap under subsampled conditions, their bias is well documented and the uncertainty of their estimates cannot be quantified. Here we derive and validate a method to rigorously estimate the population overlap from incomplete samples when the total number of objects, species, or genes in each population is known, a special case of the more general β-diversity problem that is particularly relevant in the ecology and genomic epidemiology of malaria. By solving a Bayesian inference problem, this method takes into account the rates of subsampling and produces unbiased and Bayes-optimal estimates of overlap. In addition, it provides a natural framework for computing the uncertainty of its estimates, and can be used prospectively in study planning by quantifying the tradeoff between sampling effort and uncertainty. |
format | Online Article Text |
id | pubmed-6440621 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-64406212019-04-12 Bayes-optimal estimation of overlap between populations of fixed size Larremore, Daniel B. PLoS Comput Biol Research Article Measuring the overlap between two populations is, in principle, straightforward. Upon fully sampling both populations, the number of shared objects—species, taxonomical units, or gene variants, depending on the context—can be directly counted. In practice, however, only a fraction of each population’s objects are likely to be sampled due to stochastic data collection or sequencing techniques. Although methods exist for quantifying population overlap under subsampled conditions, their bias is well documented and the uncertainty of their estimates cannot be quantified. Here we derive and validate a method to rigorously estimate the population overlap from incomplete samples when the total number of objects, species, or genes in each population is known, a special case of the more general β-diversity problem that is particularly relevant in the ecology and genomic epidemiology of malaria. By solving a Bayesian inference problem, this method takes into account the rates of subsampling and produces unbiased and Bayes-optimal estimates of overlap. In addition, it provides a natural framework for computing the uncertainty of its estimates, and can be used prospectively in study planning by quantifying the tradeoff between sampling effort and uncertainty. Public Library of Science 2019-03-29 /pmc/articles/PMC6440621/ /pubmed/30925165 http://dx.doi.org/10.1371/journal.pcbi.1006898 Text en © 2019 Daniel B. Larremore http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Bayes-optimal estimation of overlap between populations of fixed size |
topic | Research Article |
work_keys_str_mv | AT larremoredanielb bayesoptimalestimationofoverlapbetweenpopulationsoffixedsize |
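
The description above mentions estimating overlap from incomplete samples when both population sizes are known. The following is a minimal sketch of one way such an inference could be set up, not the article's implementation. It assumes each sample is drawn uniformly without replacement (hypergeometric sampling) and places a uniform prior on the true overlap; the function names and the symbols `N_a`, `N_b`, `n_a`, `n_b`, `n_ab`, and `s` are illustrative choices that do not appear in this record.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(k marked items in n draws without replacement from N items, K marked)."""
    if k < 0 or k > K or k > n or n - k > N - K:
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def overlap_posterior(n_ab, n_a, n_b, N_a, N_b):
    """Posterior over the true overlap s = 0..min(N_a, N_b).

    Assumptions (illustrative, not taken from the record):
      - populations A and B have known sizes N_a and N_b and share s objects;
      - n_a objects are sampled from A and n_b from B, without replacement;
      - n_ab objects appear in both samples;
      - the prior on s is uniform.
    """
    s_max = min(N_a, N_b)
    weights = []
    for s in range(s_max + 1):
        likelihood = 0.0
        # s_a = number of the s shared objects that landed in the sample from A.
        for s_a in range(n_ab, min(s, n_a) + 1):
            likelihood += (hypergeom_pmf(s_a, N_a, s, n_a)
                           * hypergeom_pmf(n_ab, N_b, s_a, n_b))
        weights.append(likelihood)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observed counts are impossible under this model")
    return [w / total for w in weights]


def posterior_mean(posterior):
    """Posterior mean of s, the minimizer of expected squared error."""
    return sum(s * p for s, p in enumerate(posterior))


if __name__ == "__main__":
    # Hypothetical numbers: 60 objects per population, 20 sampled from each,
    # 5 objects seen in both samples.
    post = overlap_posterior(n_ab=5, n_a=20, n_b=20, N_a=60, N_b=60)
    print("posterior mean overlap:", posterior_mean(post))
```

Because the full posterior is returned, an interval or a variance can be read from the same list, which is the kind of uncertainty estimate the description refers to. With exhaustive sampling (`n_a == N_a` and `n_b == N_b`) the posterior in this sketch collapses onto the observed count.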