Mixture SNPs effect on phenotype in genome-wide association studies

BACKGROUND: Mixed linear models have recently been used to address the issue of "missing" heritability in traditional genome-wide association studies (GWAS). These models assume that all single-nucleotide polymorphisms (SNPs) are associated with the phenotypes of interest. However, it is more common th...

Bibliographic Details
Main Authors: Wang, Ling; Shen, Haipeng; Liu, Hexuan; Guo, Guang
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4417323/
https://www.ncbi.nlm.nih.gov/pubmed/25649116
http://dx.doi.org/10.1186/1471-2164-16-3
_version_ 1782369357434716160
author Wang, Ling
Shen, Haipeng
Liu, Hexuan
Guo, Guang
collection PubMed
description BACKGROUND: Mixed linear models have recently been used to address the issue of "missing" heritability in traditional genome-wide association studies (GWAS). These models assume that all single-nucleotide polymorphisms (SNPs) are associated with the phenotypes of interest. However, it is more common that only a small proportion of SNPs have significant effects on the phenotypes, while most SNPs have no or very small effects. To incorporate this feature, we propose an efficient Hierarchical Bayesian Model (HBM) that extends the existing mixed models to enforce automatic selection of significant SNPs. The HBM models the SNP effects using a mixture of a point mass at zero and a normal distribution, where the point mass corresponds to the non-associated SNPs. RESULTS: We estimate the HBM using Gibbs sampling. The estimation performance of our method is first demonstrated through two simulation studies. We make the simulation setups realistic by using parameters fitted on the Framingham Heart Study (FHS) data. The simulation studies show that our method can accurately estimate the proportion of SNPs associated with the simulated phenotype and identify these SNPs, and that it adapts better to certain model mis-specifications than the standard mixed models. In addition, we analyze data from the FHS and the Health and Retirement Study (HRS) to study the association between Body Mass Index (BMI) and SNPs on Chromosome 16, and replicate the identified genetic associations. The analysis of the FHS data identifies 0.3% of SNPs on Chromosome 16 as affecting BMI, including rs9939609 and rs9939973 on the FTO gene. These two SNPs are in strong linkage disequilibrium with rs1558902 (Rsq = 0.901 for rs9939609 and Rsq = 0.905 for rs9939973), which has been reported to be linked with obesity in previous GWAS. We then replicate the findings using the HRS data: the analysis finds 0.4% of SNPs on Chromosome 16 to be associated with BMI. Furthermore, around 25% of the genes identified as associated with BMI are common to the two studies. CONCLUSIONS: The results demonstrate that the HBM and the associated estimation algorithm offer a powerful tool for identifying significant genetic associations with phenotypes of interest among the large numbers of SNPs that are common in modern genetic studies.
format Online
Article
Text
id pubmed-4417323
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4417323 2015-05-03 Mixture SNPs effect on phenotype in genome-wide association studies Wang, Ling Shen, Haipeng Liu, Hexuan Guo, Guang BMC Genomics Methodology Article BioMed Central 2015-02-03 /pmc/articles/PMC4417323/ /pubmed/25649116 http://dx.doi.org/10.1186/1471-2164-16-3 Text en © Wang et al.; licensee BioMed Central. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Mixture SNPs effect on phenotype in genome-wide association studies
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4417323/
https://www.ncbi.nlm.nih.gov/pubmed/25649116
http://dx.doi.org/10.1186/1471-2164-16-3