Power estimation and sample size determination for replication studies of genome-wide association studies
BACKGROUND: Replication studies are a commonly used verification method for filtering out false positives in genome-wide association studies (GWAS). If an association is confirmed in a replication study, it can be regarded with high confidence as a true positive. To design a replication study, traditional appr...
Main Authors: | Jiang, Wei; Yu, Weichuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895704/ https://www.ncbi.nlm.nih.gov/pubmed/26818952 http://dx.doi.org/10.1186/s12864-015-2296-4 |
_version_ | 1782435905214087168 |
---|---|
author | Jiang, Wei Yu, Weichuan |
author_facet | Jiang, Wei Yu, Weichuan |
author_sort | Jiang, Wei |
collection | PubMed |
description | BACKGROUND: Replication studies are a commonly used verification method for filtering out false positives in genome-wide association studies (GWAS). If an association is confirmed in a replication study, it can be regarded with high confidence as a true positive. To design a replication study, traditional approaches calculate power by treating the replication study as another independent primary study. These approaches do not use the information provided by the primary study. Moreover, they require a minimum detectable effect size to be specified, which may be subjective. One might consider replacing the minimum effect size with the observed effect sizes in the power calculation. However, this approach makes the designed replication study underpowered: only the positive associations from the primary study are of interest, so the observed effect sizes are subject to the “winner’s curse”. RESULTS: An Empirical Bayes (EB) based method is proposed to estimate the power of the replication study for each association. The proposed approach also estimates the corresponding credible interval. Simulation experiments show that our method outperforms other plug-in based estimators in overcoming the winner’s curse and in estimation accuracy. The coverage probability of the credible interval is well calibrated in the simulation experiments. A weighted average method is used to estimate the average power of all underlying true associations, and this average power is used to determine the sample size of the replication study. Sample sizes are estimated for 6 diseases from the Wellcome Trust Case Control Consortium (WTCCC) using our method. They are higher than the sample sizes estimated by plugging the observed effect sizes into the power calculation. CONCLUSIONS: Our new method can objectively determine the replication study’s sample size by using information extracted from the primary study, and it alleviates the winner’s curse. It is therefore a better choice when designing replication studies of GWAS. The R package is available at: http://bioinformatics.ust.hk/RPower.html. |
format | Online Article Text |
id | pubmed-4895704 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4895704 2016-06-10 Power estimation and sample size determination for replication studies of genome-wide association studies Jiang, Wei Yu, Weichuan BMC Genomics Methodology BioMed Central 2016-01-11 /pmc/articles/PMC4895704/ /pubmed/26818952 http://dx.doi.org/10.1186/s12864-015-2296-4 Text en © Jiang and Yu. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Jiang, Wei Yu, Weichuan Power estimation and sample size determination for replication studies of genome-wide association studies |
title | Power estimation and sample size determination for replication studies of genome-wide association studies |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895704/ https://www.ncbi.nlm.nih.gov/pubmed/26818952 http://dx.doi.org/10.1186/s12864-015-2296-4 |
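
The winner's curse mentioned in the description can be illustrated with a small simulation. The sketch below is not the Empirical Bayes method of the paper or the RPower package; it only shows why plugging observed effect sizes into a power calculation tends to be optimistic. All names (`replication_power`, `simulate`) and all parameter values (true z-score mean, significance levels, sample size ratio) are hypothetical choices made for illustration, and a simple two-sided z-test is assumed for both studies.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def replication_power(z_primary, ratio, alpha):
    """Power of a two-sided z-test in the replication study.

    z_primary: mean of the primary-study z-score (true or assumed).
    ratio:     replication sample size divided by primary sample size.
    alpha:     two-sided significance level of the replication test.
    """
    crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = z_primary * ratio ** 0.5  # z-score mean scales with sqrt(n)
    upper_tail = 1 - STD_NORMAL.cdf(crit - shift)
    lower_tail = STD_NORMAL.cdf(-crit - shift)
    return upper_tail + lower_tail


def simulate(true_mean=4.5, alpha_primary=5e-8, alpha_rep=1e-3,
             ratio=1.0, n_sims=100_000, seed=0):
    """Compare plug-in power with true power among significant findings.

    Every simulated association has the same true z-score mean
    (a hypothetical simplification). Only associations that pass the
    primary-study threshold are kept, which is what induces the
    winner's curse.
    """
    rng = random.Random(seed)
    crit = STD_NORMAL.inv_cdf(1 - alpha_primary / 2)
    observed = (rng.gauss(true_mean, 1.0) for _ in range(n_sims))
    selected = [z for z in observed if abs(z) > crit]
    # Plug-in estimate: treat each observed z-score as if it were the truth.
    plug_in = sum(replication_power(z, ratio, alpha_rep)
                  for z in selected) / len(selected)
    # True power: computed from the z-score mean used to generate the data.
    true_power = replication_power(true_mean, ratio, alpha_rep)
    return len(selected), plug_in, true_power


if __name__ == "__main__":
    n_selected, plug_in, true_power = simulate()
    print(f"selected associations: {n_selected}")
    print(f"mean plug-in power:    {plug_in:.3f}")
    print(f"true power:            {true_power:.3f}")
```

Because every selected z-score exceeds the primary-study threshold, the mean plug-in power in this setup is higher than the true power, so a sample size chosen from the plug-in value would leave the replication study underpowered.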