An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies
Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for...
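Main authors: | Thompson, Wesley K.; Wang, Yunpeng; Schork, Andrew J.; Witoelar, Aree; Zuber, Verena; Xu, Shujing; Werge, Thomas; Holland, Dominic; Andreassen, Ole A.; Dale, Anders M. |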
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5456456/ https://www.ncbi.nlm.nih.gov/pubmed/26714184 http://dx.doi.org/10.1371/journal.pgen.1005717 |
_version_ | 1783241265601576960 |
---|---|
author | Thompson, Wesley K. Wang, Yunpeng Schork, Andrew J. Witoelar, Aree Zuber, Verena Xu, Shujing Werge, Thomas Holland, Dominic Andreassen, Ole A. Dale, Anders M. |
author_facet | Thompson, Wesley K. Wang, Yunpeng Schork, Andrew J. Witoelar, Aree Zuber, Verena Xu, Shujing Werge, Thomas Holland, Dominic Andreassen, Ole A. Dale, Anders M. |
author_sort | Thompson, Wesley K. |
collection | PubMed |
description | Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. We discuss the implications of pervasive small but replicating effects in CD and SZ on genomic control and power. Finally, we conclude that, despite having very similar estimates of variance explained by genotyped SNPs, CD and SZ have a broadly dissimilar genetic architecture, due to differing mean effect size and proportion of non-null loci. |
format | Online Article Text |
id | pubmed-5456456 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5456456 2017-06-05 An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies Thompson, Wesley K. Wang, Yunpeng Schork, Andrew J. Witoelar, Aree Zuber, Verena Xu, Shujing Werge, Thomas Holland, Dominic Andreassen, Ole A. Dale, Anders M. PLoS Genet Research Article Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. We discuss the implications of pervasive small but replicating effects in CD and SZ on genomic control and power. Finally, we conclude that, despite having very similar estimates of variance explained by genotyped SNPs, CD and SZ have a broadly dissimilar genetic architecture, due to differing mean effect size and proportion of non-null loci. Public Library of Science 2015-12-29 /pmc/articles/PMC5456456/ /pubmed/26714184 http://dx.doi.org/10.1371/journal.pgen.1005717 Text en © 2015 Thompson et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Thompson, Wesley K. Wang, Yunpeng Schork, Andrew J. Witoelar, Aree Zuber, Verena Xu, Shujing Werge, Thomas Holland, Dominic Andreassen, Ole A. Dale, Anders M. An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies |
title | An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies |
title_sort | empirical bayes mixture model for effect size distributions in genome-wide association studies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5456456/ https://www.ncbi.nlm.nih.gov/pubmed/26714184 http://dx.doi.org/10.1371/journal.pgen.1005717 |
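
The description above models null and non-null test statistics as zero-mean normals that differ only in variance, and mentions the local false discovery rate as one quantity derived from that model. A minimal sketch of what such a two-component scale mixture looks like is given below. The function names and the parameter values (`PI1`, `SD_NULL`, `SD_NONNULL`) are hypothetical illustrations, not the article's code or its fitted estimates, and the article's fitting procedure (minimizing discrepancies against resampling-based nonparametric estimates) is not implemented here. The local false discovery rate is computed with the standard two-groups formula, which is assumed to match the sense intended in the description.

```python
import math


def normal_pdf(z, sd):
    """Density at z of a normal distribution with mean zero and standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(z, pi1, sd_null, sd_nonnull):
    """Scale mixture of two zero-mean normals.

    pi1 is the assumed proportion of non-null test statistics;
    sd_nonnull should be larger than sd_null.
    """
    return (1.0 - pi1) * normal_pdf(z, sd_null) + pi1 * normal_pdf(z, sd_nonnull)


def local_fdr(z, pi1, sd_null, sd_nonnull):
    """Posterior probability that a statistic with value z came from the null component."""
    null_part = (1.0 - pi1) * normal_pdf(z, sd_null)
    return null_part / mixture_pdf(z, pi1, sd_null, sd_nonnull)


# Hypothetical parameter values chosen only for illustration.
PI1 = 0.05         # assumed non-null proportion
SD_NULL = 1.0      # assumed null standard deviation
SD_NONNULL = 3.0   # assumed (larger) non-null standard deviation

for z in (0.0, 2.0, 4.0, 6.0):
    fdr = local_fdr(z, PI1, SD_NULL, SD_NONNULL)
    print(f"z = {z:.1f}  local fdr = {fdr:.6f}")
```

Because both components are centered at zero, the local false discovery rate depends only on the magnitude of the statistic and falls as that magnitude grows, which is why a larger non-null variance is what separates the two groups in this kind of model.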