
Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies

Recently genome-wide association studies (GWAS) have identified numerous susceptibility variants for complex diseases. In this study we proposed several approaches to estimate the total number of variants underlying these diseases. We assume that the variance explained by genetic markers (Vg) follow...


Bibliographic Details
Main Authors:	So, Hon-Cheong; Yip, Benjamin H. K.; Sham, Pak Chung
Format:	Text
Language:	English
Published:	Public Library of Science 2010
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2984437/ https://www.ncbi.nlm.nih.gov/pubmed/21103334 http://dx.doi.org/10.1371/journal.pone.0013898

_version_	1782192088620728320
author	So, Hon-Cheong Yip, Benjamin H. K. Sham, Pak Chung
author_facet	So, Hon-Cheong Yip, Benjamin H. K. Sham, Pak Chung
author_sort	So, Hon-Cheong
collection	PubMed
description	Recently, genome-wide association studies (GWAS) have identified numerous susceptibility variants for complex diseases. In this study, we proposed several approaches to estimate the total number of variants underlying these diseases. We assume that the variance explained by genetic markers (Vg) follows an exponential distribution, which is justified by previous studies on theories of adaptation. Our aim is to fit the observed distribution of Vg from GWAS to its theoretical distribution. The number of variants is obtained by dividing the heritability by the estimated mean of the exponential distribution. In practice, due to limited sample sizes, there is insufficient power to detect variants with small effects. Therefore, power was taken into account in the fitting. Besides considering the most significant variants, we also tried relaxing the significance threshold, allowing more markers to be fitted. The effects of false positive variants were removed by considering the local false discovery rates. In addition, we developed an alternative approach by directly fitting the z-statistics from GWAS to their theoretical distribution. In all cases, the “winner's curse” effect was corrected analytically. Confidence intervals were also derived. Simulations were performed to compare and verify the performance of different estimators (which incorporate various means of winner's curse correction) and the coverage of the proposed analytic confidence intervals. Our methodology requires only summary statistics and can handle both binary and continuous traits. Finally, we applied the methods to a few real disease examples (lipid traits, type 2 diabetes and Crohn's disease) and estimated that hundreds to nearly a thousand variants underlie these traits.
format	Text
id	pubmed-2984437
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-2984437 2010-11-22 Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies So, Hon-Cheong Yip, Benjamin H. K. Sham, Pak Chung PLoS One Research Article
Public Library of Science 2010-11-17 /pmc/articles/PMC2984437/ /pubmed/21103334 http://dx.doi.org/10.1371/journal.pone.0013898 Text en So et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle	Research Article So, Hon-Cheong Yip, Benjamin H. K. Sham, Pak Chung Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies
title	Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies
title_full	Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies
title_fullStr	Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies
title_full_unstemmed	Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies
title_short	Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies
title_sort	estimating the total number of susceptibility variants underlying complex diseases from genome-wide association studies
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2984437/ https://www.ncbi.nlm.nih.gov/pubmed/21103334 http://dx.doi.org/10.1371/journal.pone.0013898
work_keys_str_mv	AT sohoncheong estimatingthetotalnumberofsusceptibilityvariantsunderlyingcomplexdiseasesfromgenomewideassociationstudies AT yipbenjaminhk estimatingthetotalnumberofsusceptibilityvariantsunderlyingcomplexdiseasesfromgenomewideassociationstudies AT shampakchung estimatingthetotalnumberofsusceptibilityvariantsunderlyingcomplexdiseasesfromgenomewideassociationstudies
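
The description field in the record above states that the number of variants is the heritability divided by the estimated mean of the exponential distribution of Vg. The sketch below illustrates only that arithmetic. It is not the authors' estimator: it stands in a hard detection cutoff (the hypothetical `threshold` parameter) for the paper's power adjustment, and it omits the winner's curse and false discovery rate corrections. The function name, the cutoff, and all data values are assumptions made for illustration; the data are synthetic.

```python
import random


def estimate_num_variants(detected_vg, heritability, threshold=0.0):
    """Estimate the number of variants as heritability / mean(Vg).

    Assumes Vg is exponentially distributed and that only values at or
    above `threshold` were observed (a hard cutoff, a simplification).
    """
    if not detected_vg:
        raise ValueError("need at least one detected variant")
    if any(v < threshold for v in detected_vg):
        raise ValueError("detected values must be at or above the threshold")
    # Memoryless property: the excess over the cutoff is again exponential
    # with the same mean, so its sample mean estimates the overall mean.
    mean_vg = sum(v - threshold for v in detected_vg) / len(detected_vg)
    return heritability / mean_vg


# Synthetic data only: 1000 hypothetical variants, none taken from the paper.
rng = random.Random(0)
true_count = 1000
all_vg = [rng.expovariate(1 / 0.0005) for _ in range(true_count)]
heritability = sum(all_vg)  # total variance explained by all variants
threshold = 0.001           # hypothetical detection cutoff
detected = [v for v in all_vg if v >= threshold]

estimate = estimate_num_variants(detected, heritability, threshold)
print(f"detected {len(detected)} of {true_count}; estimated total {estimate:.0f}")
```

Subtracting the cutoff before averaging matters here: taking the plain mean of the detected values would overstate the mean of Vg and so understate the number of variants.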


Similar Items