An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies

Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for...

Bibliographic Details
Main Authors: Thompson, Wesley K., Wang, Yunpeng, Schork, Andrew J., Witoelar, Aree, Zuber, Verena, Xu, Shujing, Werge, Thomas, Holland, Dominic, Andreassen, Ole A., Dale, Anders M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5456456/
https://www.ncbi.nlm.nih.gov/pubmed/26714184
http://dx.doi.org/10.1371/journal.pgen.1005717
_version_ 1783241265601576960
author Thompson, Wesley K.
Wang, Yunpeng
Schork, Andrew J.
Witoelar, Aree
Zuber, Verena
Xu, Shujing
Werge, Thomas
Holland, Dominic
Andreassen, Ole A.
Dale, Anders M.
collection PubMed
description Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. 
We discuss the implications of pervasive small but replicating effects in CD and SZ on genomic control and power. Finally, we conclude that, despite having very similar estimates of variance explained by genotyped SNPs, CD and SZ have a broadly dissimilar genetic architecture, due to differing mean effect size and proportion of non-null loci.
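The two-component model summarized in the description can be illustrated with a minimal sketch. It assumes a null proportion `pi0`, a null standard deviation `sigma0`, and a larger non-null standard deviation `sigma1`. These names and the example values are hypothetical, chosen only for illustration; they are not the paper's notation or estimates. The sketch evaluates the mixture density and the corresponding local false discovery rate for a given test statistic. It does not reproduce the resampling-based fitting procedure the description mentions.

```python
import math


def normal_pdf(z, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(z, pi0, sigma0, sigma1):
    """Scale mixture of two zero-mean normals.

    pi0    : assumed proportion of null test statistics (hypothetical name)
    sigma0 : assumed standard deviation of the null component
    sigma1 : assumed standard deviation of the non-null component (> sigma0)
    """
    return pi0 * normal_pdf(z, sigma0) + (1.0 - pi0) * normal_pdf(z, sigma1)


def local_fdr(z, pi0, sigma0, sigma1):
    """Posterior probability that a statistic z comes from the null component."""
    return pi0 * normal_pdf(z, sigma0) / mixture_pdf(z, pi0, sigma0, sigma1)


if __name__ == "__main__":
    # Illustrative parameter values only, not estimates from any data set.
    for z in (0.0, 2.0, 4.0, 6.0):
        print(z, local_fdr(z, pi0=0.95, sigma0=1.0, sigma1=3.0))
```

Because both components are centered at zero and differ only in scale, the local false discovery rate in this sketch depends on the test statistic only through its magnitude and shrinks as that magnitude grows.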
format Online
Article
Text
id pubmed-5456456
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5456456 2017-06-05 PLoS Genet Research Article Public Library of Science 2015-12-29 /pmc/articles/PMC5456456/ /pubmed/26714184 http://dx.doi.org/10.1371/journal.pgen.1005717 Text en © 2015 Thompson et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies
topic Research Article