
Power estimation and sample size determination for replication studies of genome-wide association studies

BACKGROUND: Replication studies are commonly used to filter out false positives in genome-wide association studies (GWAS). An association that is confirmed in a replication study can be regarded with high confidence as a true positive. To design a replication study, traditional appr...


Bibliographic Details
Main Authors:	Jiang, Wei; Yu, Weichuan
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895704/ https://www.ncbi.nlm.nih.gov/pubmed/26818952 http://dx.doi.org/10.1186/s12864-015-2296-4

collection	PubMed
description	BACKGROUND: Replication studies are commonly used to filter out false positives in genome-wide association studies (GWAS). An association that is confirmed in a replication study can be regarded with high confidence as a true positive. To design a replication study, traditional approaches calculate power by treating the replication study as another independent primary study. These approaches do not use the information provided by the primary study. They also require a minimum detectable effect size to be specified, which may be subjective. One might instead replace the minimum effect size with the observed effect sizes in the power calculation. However, this leaves the designed replication study underpowered: only the positive associations from the primary study are of interest, so the “winner’s curse” arises. RESULTS: An Empirical Bayes (EB) based method is proposed to estimate the power of the replication study for each association, together with the corresponding credible interval. Simulation experiments show that our method outperforms other plug-in based estimators in overcoming the winner’s curse and in estimation accuracy. The coverage probability of the credible interval is well calibrated in the simulation experiments. A weighted average method is used to estimate the average power over all underlying true associations, and this average power is used to determine the sample size of the replication study. Sample sizes are estimated for 6 diseases from the Wellcome Trust Case Control Consortium (WTCCC) using our method. They are larger than the sample sizes estimated by plugging observed effect sizes into the power calculation. CONCLUSIONS: Our new method can objectively determine the sample size of a replication study by using information extracted from the primary study, and it alleviates the winner’s curse. It is therefore a better choice when designing replication studies of GWAS. The R package is available at: http://bioinformatics.ust.hk/RPower.html.
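The plug-in problem described in the abstract can be illustrated with a small simulation. The sketch below is not the Empirical Bayes method of the paper and does not use the RPower package. It assumes a simple one-sided z-test model in which the expected z-score grows with the square root of the sample size. All function names and parameter values (`replication_power`, `required_ratio`, `simulate_plugin_bias`, a true z-score of 4.0, a selection threshold of 5.0) are hypothetical choices made for illustration only.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def replication_power(mu, n_ratio, alpha):
    """Power of a one-sided z-test in the replication study.

    mu      -- expected z-score of the association in the primary study
    n_ratio -- replication sample size divided by primary sample size
    alpha   -- one-sided significance level used in the replication study
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    # Assumption: the expected z-score scales with sqrt(sample size).
    return 1.0 - STD_NORMAL.cdf(z_crit - mu * n_ratio ** 0.5)


def required_ratio(mu, alpha, target_power):
    """Smallest n_ratio at which replication_power reaches target_power (mu > 0)."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_power = STD_NORMAL.inv_cdf(target_power)
    return ((z_crit + z_power) / mu) ** 2


def simulate_plugin_bias(true_mu, threshold, n_ratio, alpha, n_sims, seed):
    """Return (true power, mean plug-in power over significant primary results)."""
    rng = random.Random(seed)
    observed = [rng.gauss(true_mu, 1.0) for _ in range(n_sims)]
    # Only primary results that pass the threshold are followed up.
    selected = [z for z in observed if z > threshold]
    if not selected:
        raise ValueError("no simulated primary study passed the threshold")
    # Plug-in estimate: treat the observed z-score as if it were the true one.
    plug_in = [replication_power(z, n_ratio, alpha) for z in selected]
    return replication_power(true_mu, n_ratio, alpha), sum(plug_in) / len(plug_in)


if __name__ == "__main__":
    true_p, plug_p = simulate_plugin_bias(4.0, 5.0, 0.5, 0.001, 20000, 1)
    print(f"true power {true_p:.3f}, mean plug-in power {plug_p:.3f}")
    print(f"ratio from true effect     {required_ratio(4.0, 0.001, 0.8):.3f}")
    print(f"ratio from observed z=5.5  {required_ratio(5.5, 0.001, 0.8):.3f}")
```

Under these assumptions, every selected z-score exceeds the true value, so the plug-in power is too optimistic and the sample size derived from it is too small. This is the direction of bias that the abstract attributes to the winner's curse.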
id	pubmed-4895704
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-4895704 2016-06-10 BMC Genomics Methodology BioMed Central 2016-01-11 /pmc/articles/PMC4895704/ /pubmed/26818952 http://dx.doi.org/10.1186/s12864-015-2296-4 Text en © Jiang and Yu. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
