Power estimation and sample size determination for replication studies of genome-wide association studies

Bibliographic Details
Main authors: Jiang, Wei; Yu, Weichuan
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895704/
https://www.ncbi.nlm.nih.gov/pubmed/26818952
http://dx.doi.org/10.1186/s12864-015-2296-4
_version_ 1782435905214087168
author Jiang, Wei
Yu, Weichuan
collection PubMed
description BACKGROUND: Replication studies are commonly used to filter out false positives in genome-wide association studies (GWAS). An association that is confirmed in a replication study can be regarded with high confidence as a true positive. To design a replication study, traditional approaches calculate power by treating the replication study as another independent primary study. These approaches do not use the information provided by the primary study. They also require a minimum detectable effect size to be specified, which may be subjective. One might replace the minimum effect size with the observed effect sizes in the power calculation. However, this makes the designed replication study underpowered: only the positive associations from the primary study are of interest, so the observed effect sizes are inflated by the “winner’s curse”. RESULTS: An Empirical Bayes (EB) based method is proposed to estimate the power of a replication study for each association, together with the corresponding credible interval. Simulation experiments show that our method outperforms other plug-in based estimators in overcoming the winner’s curse and in estimation accuracy. The coverage probability of the credible interval is well calibrated in the simulation experiments. A weighted average is used to estimate the average power over all underlying true associations, and this average power is used to determine the sample size of the replication study. Sample sizes are estimated for 6 diseases from the Wellcome Trust Case Control Consortium (WTCCC) using our method. They are higher than the sample sizes estimated by plugging observed effect sizes into the power calculation. CONCLUSIONS: Our new method can objectively determine a replication study’s sample size by using information extracted from the primary study, and it alleviates the winner’s curse. Thus, it is a better choice when designing replication studies of GWAS. The R package is available at: http://bioinformatics.ust.hk/RPower.html.
format Online
Article
Text
id pubmed-4895704
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4895704 2016-06-10 Power estimation and sample size determination for replication studies of genome-wide association studies Jiang, Wei Yu, Weichuan BMC Genomics Methodology BioMed Central 2016-01-11 /pmc/articles/PMC4895704/ /pubmed/26818952 http://dx.doi.org/10.1186/s12864-015-2296-4 Text en © Jiang and Yu. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Power estimation and sample size determination for replication studies of genome-wide association studies
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895704/
https://www.ncbi.nlm.nih.gov/pubmed/26818952
http://dx.doi.org/10.1186/s12864-015-2296-4
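
The abstract's warning about plugging observed effect sizes into the power calculation can be illustrated with a small simulation. The sketch below is not the authors' Empirical Bayes method and does not use the RPower package; it assumes a normal-approximation two-sided z-test, and the function names, the true mean z-score of 3.0, the primary threshold of 5e-8, and the replication level of 0.05 are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and tail areas


def replication_power(mu, n_ratio=1.0, alpha=0.05):
    """Power of a two-sided z-test in the replication study.

    mu      : assumed mean of the primary-study z-statistic (hypothetical input)
    n_ratio : replication sample size divided by primary sample size
    alpha   : significance level used in the replication study
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = mu * n_ratio ** 0.5  # the z-statistic mean scales with sqrt(sample size)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)


def required_ratio(mu, target_power=0.8, alpha=0.05):
    """Approximate n_ratio reaching target_power (ignores the far rejection tail)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(target_power)
    return ((z_crit + z_power) / abs(mu)) ** 2


def significant_primary_z(mu, alpha_primary=5e-8, n_sims=200_000, seed=0):
    """Simulate primary z-statistics and keep only the genome-wide significant ones."""
    rng = random.Random(seed)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha_primary / 2)
    draws = (rng.gauss(mu, 1.0) for _ in range(n_sims))
    return [z for z in draws if abs(z) > z_crit]


true_mu = 3.0  # illustrative true effect on the z scale
selected = significant_primary_z(true_mu)
mean_observed = sum(abs(z) for z in selected) / len(selected)
plug_in_power = sum(replication_power(abs(z)) for z in selected) / len(selected)

print(f"significant associations: {len(selected)}")
print(f"true mean z = {true_mu}, mean observed |z| = {mean_observed:.2f}")
print(f"true power = {replication_power(true_mu):.3f}, "
      f"plug-in power = {plug_in_power:.3f}")
print(f"n_ratio for 80% power: true = {required_ratio(true_mu):.2f}, "
      f"plug-in = {required_ratio(mean_observed):.2f}")
```

Because only associations that pass the primary threshold are kept, their observed z-scores exceed the true value on average. The plug-in power is therefore too optimistic and the plug-in sample size too small, which is consistent with the record's statement that sample sizes from the proposed method are higher than those from plugging in observed effect sizes.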