Statistical Power of Model Selection Strategies for Genome-Wide Association Studies

Genome-wide association studies (GWAS) aim to identify genetic variants related to diseases by examining the associations between phenotypes and hundreds of thousands of genotyped markers. Because many genes are potentially involved in common diseases and a large number of markers are analyzed, it i...

Bibliographic Details
Main authors: Wu, Zheyang; Zhao, Hongyu
Format: Text
Language: English
Published: Public Library of Science, 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2712761/
https://www.ncbi.nlm.nih.gov/pubmed/19649321
http://dx.doi.org/10.1371/journal.pgen.1000582
author Wu, Zheyang
Zhao, Hongyu
collection PubMed
description Genome-wide association studies (GWAS) aim to identify genetic variants related to diseases by examining the associations between phenotypes and hundreds of thousands of genotyped markers. Because many genes are potentially involved in common diseases and a large number of markers are analyzed, it is crucial to devise an effective strategy to identify truly associated variants that have individual and/or interactive effects, while controlling false positives at the desired level. Although a number of model selection methods have been proposed in the literature, including marginal search, exhaustive search, and forward search, their relative performance has only been evaluated through limited simulations due to the lack of an analytical approach to calculating the power of these methods. This article develops a novel statistical approach for power calculation, derives accurate formulas for the power of different model selection strategies, and then uses the formulas to evaluate and compare these strategies in genetic model spaces. In contrast to previous studies, our theoretical framework allows for random genotypes, correlations among test statistics, and a false-positive control based on GWAS practice. After the accuracy of our analytical results is validated through simulations, they are utilized to systematically evaluate and compare the performance of these strategies in a wide class of genetic models. For a specific genetic model, our results clearly reveal how different factors, such as effect size, allele frequency, and interaction, jointly affect the statistical power of each strategy. An example is provided for the application of our approach to empirical research. The statistical approach used in our derivations is general and can be employed to address the model selection problems in other random predictor settings. 
We have developed an R package markerSearchPower to implement our formulas, which can be downloaded from the Comprehensive R Archive Network (CRAN) or http://bioinformatics.med.yale.edu/group/.
format Text
id pubmed-2712761
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2712761 2009-08-01 Statistical Power of Model Selection Strategies for Genome-Wide Association Studies Wu, Zheyang Zhao, Hongyu PLoS Genet Research Article Public Library of Science 2009-07-31 /pmc/articles/PMC2712761/ /pubmed/19649321 http://dx.doi.org/10.1371/journal.pgen.1000582 Text en Wu, Zhao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Statistical Power of Model Selection Strategies for Genome-Wide Association Studies
topic Research Article