Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative
To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. selecting first clusters, and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related t...
Main authors: | Innocenti, Francesco; Candel, Math JJM; Tan, Frans ES; van Breukelen, Gerard JP |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8172256/ https://www.ncbi.nlm.nih.gov/pubmed/32940135 http://dx.doi.org/10.1177/0962280220952833 |
_version_ | 1783702506611671040 |
---|---|
author | Innocenti, Francesco Candel, Math JJM Tan, Frans ES van Breukelen, Gerard JP |
author_facet | Innocenti, Francesco Candel, Math JJM Tan, Frans ES van Breukelen, Gerard JP |
author_sort | Innocenti, Francesco |
collection | PubMed |
description | To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. selecting first clusters, and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related to the mean of the outcome variable of interest (informative cluster size), the following competing sampling designs are considered: sampling clusters with probability proportional to cluster size, and then the same number of individuals per cluster; drawing clusters with equal probability, and then the same percentage of individuals per cluster; and selecting clusters with equal probability, and then the same number of individuals per cluster. For each design, optimal sample sizes are derived under a budget constraint. The three optimal two-stage sampling designs are compared, in terms of efficiency, with each other and with simple random sampling of individuals. Sampling clusters with probability proportional to size is recommended. To overcome the dependency of the optimal design on unknown nuisance parameters, maximin designs are derived. The results are illustrated, assuming probability proportional to size sampling of clusters, with the planning of a hypothetical survey to compare adolescent alcohol consumption between France and Italy. |
format | Online Article Text |
id | pubmed-8172256 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-8172256 2021-06-21 Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative Innocenti, Francesco Candel, Math JJM Tan, Frans ES van Breukelen, Gerard JP Stat Methods Med Res Articles To estimate the mean of a quantitative variable in a hierarchical population, it is logistically convenient to sample in two stages (two-stage sampling), i.e. selecting first clusters, and then individuals from the sampled clusters. Allowing cluster size to vary in the population and to be related to the mean of the outcome variable of interest (informative cluster size), the following competing sampling designs are considered: sampling clusters with probability proportional to cluster size, and then the same number of individuals per cluster; drawing clusters with equal probability, and then the same percentage of individuals per cluster; and selecting clusters with equal probability, and then the same number of individuals per cluster. For each design, optimal sample sizes are derived under a budget constraint. The three optimal two-stage sampling designs are compared, in terms of efficiency, with each other and with simple random sampling of individuals. Sampling clusters with probability proportional to size is recommended. To overcome the dependency of the optimal design on unknown nuisance parameters, maximin designs are derived. The results are illustrated, assuming probability proportional to size sampling of clusters, with the planning of a hypothetical survey to compare adolescent alcohol consumption between France and Italy.
SAGE Publications 2020-09-17 2021-02 /pmc/articles/PMC8172256/ /pubmed/32940135 http://dx.doi.org/10.1177/0962280220952833 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Articles Innocenti, Francesco Candel, Math JJM Tan, Frans ES van Breukelen, Gerard JP Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
title | Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
title_full | Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
title_fullStr | Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
title_full_unstemmed | Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
title_short | Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
title_sort | optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8172256/ https://www.ncbi.nlm.nih.gov/pubmed/32940135 http://dx.doi.org/10.1177/0962280220952833 |
work_keys_str_mv | AT innocentifrancesco optimaltwostagesamplingformeanestimationinmultilevelpopulationswhenclustersizeisinformative AT candelmathjjm optimaltwostagesamplingformeanestimationinmultilevelpopulationswhenclustersizeisinformative AT tanfranses optimaltwostagesamplingformeanestimationinmultilevelpopulationswhenclustersizeisinformative AT vanbreukelengerardjp optimaltwostagesamplingformeanestimationinmultilevelpopulationswhenclustersizeisinformative |
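
The three two-stage designs named in the description can be made concrete with a small simulation. The sketch below is not taken from the article: the population generator, the function names, and the estimator paired with each design (unweighted mean of cluster means under probability-proportional-to-size sampling, pooled mean under equal-percentage sampling, size-weighted mean under equal-number sampling) are assumptions chosen as common textbook pairings. It does not cover the budget-constrained optimal sample sizes or the maximin designs.

```python
import random
from statistics import mean


def make_population(n_clusters=200, seed=1):
    """Hypothetical population in which cluster means rise with cluster size."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_clusters):
        size = rng.randint(20, 200)
        # Informative cluster size (assumed form): larger clusters have higher means.
        cluster_mean = 10.0 + 0.02 * size + rng.gauss(0.0, 1.0)
        population.append([cluster_mean + rng.gauss(0.0, 2.0) for _ in range(size)])
    return population


def true_mean(population):
    """Mean over all individuals in the population."""
    return sum(sum(c) for c in population) / sum(len(c) for c in population)


def pps_fixed_n(population, k, n, rng):
    """Design 1: k clusters drawn with probability proportional to size
    (with replacement), then n individuals per cluster.
    Assumed estimator: unweighted mean of the cluster sample means."""
    sizes = [len(c) for c in population]
    chosen = rng.choices(population, weights=sizes, k=k)
    return mean([mean(rng.sample(c, n)) for c in chosen])


def equal_prob_fixed_fraction(population, k, fraction, rng):
    """Design 2: k clusters drawn with equal probability, then the same
    percentage of individuals per cluster.
    Assumed estimator: mean of all sampled individuals pooled together."""
    chosen = rng.sample(population, k)
    sampled = []
    for c in chosen:
        m = max(1, round(fraction * len(c)))
        sampled.extend(rng.sample(c, m))
    return mean(sampled)


def equal_prob_fixed_n(population, k, n, rng):
    """Design 3: k clusters drawn with equal probability, then n individuals
    per cluster. Assumed estimator: cluster sample means weighted by size."""
    chosen = rng.sample(population, k)
    numerator = sum(len(c) * mean(rng.sample(c, n)) for c in chosen)
    denominator = sum(len(c) for c in chosen)
    return numerator / denominator


if __name__ == "__main__":
    pop = make_population()
    rng = random.Random(42)
    print("true mean:", true_mean(pop))
    print("design 1:", mean([pps_fixed_n(pop, 20, 10, rng) for _ in range(500)]))
    print("design 2:", mean([equal_prob_fixed_fraction(pop, 20, 0.1, rng) for _ in range(500)]))
    print("design 3:", mean([equal_prob_fixed_n(pop, 20, 10, rng) for _ in range(500)]))
```

Running the script prints the population mean next to the average estimate from repeated draws under each design, which gives a rough feel for how each design behaves when cluster size is informative. The cluster counts, per-cluster sample sizes, and the 10% fraction are arbitrary illustration values.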