Population Substructure and Control Selection in Genome-Wide Association Studies

Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Am...

Bibliographic Details
Main authors: Yu, Kai, Wang, Zhaoming, Li, Qizhai, Wacholder, Sholom, Hunter, David J., Hoover, Robert N., Chanock, Stephen, Thomas, Gilles
Format: Text
Language: English
Published: Public Library of Science 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2432498/
https://www.ncbi.nlm.nih.gov/pubmed/18596976
http://dx.doi.org/10.1371/journal.pone.0002551
_version_ 1782156460095963136
author Yu, Kai
Wang, Zhaoming
Li, Qizhai
Wacholder, Sholom
Hunter, David J.
Hoover, Robert N.
Chanock, Stephen
Thomas, Gilles
author_facet Yu, Kai
Wang, Zhaoming
Li, Qizhai
Wacholder, Sholom
Hunter, David J.
Hoover, Robert N.
Chanock, Stephen
Thomas, Gilles
author_sort Yu, Kai
collection PubMed
description Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r² < 0.004) was selected to infer population substructure with principal component analysis. A novel permutation procedure was developed for the correction of PS that identified a smaller set of principal components and achieved a better control of type I error (to λ of 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed.
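The inflation factor λ quoted in the description can be illustrated with a short sketch. The record does not say how the authors computed λ, so the code below assumes the common genomic-control convention: the median of the observed 1-degree-of-freedom association statistics divided by the median of the null chi-square distribution (about 0.455). The function names are hypothetical and are not taken from the article.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (~0.455),
# obtained as the square of the 75th percentile of a standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def chi2_from_pvalue(p):
    """Convert a two-sided p-value to the equivalent 1-df chi-square statistic."""
    if not 0.0 < p < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z


def inflation_factor(chi2_stats):
    """Assumed definition: median observed statistic over the null median."""
    stats = list(chi2_stats)
    if not stats:
        raise ValueError("need at least one test statistic")
    return median(stats) / CHI2_1DF_MEDIAN
```

Under this convention a value of λ near 1 indicates little inflation of the test statistics, which is how the values of 1.025 and 1.005 in the description are read; values such as 1.090 indicate more inflation. In practice the input would be the per-SNP statistics from a genome-wide scan, for example `inflation_factor(chi2_from_pvalue(p) for p in pvalues)`, where `pvalues` is a hypothetical list of per-SNP p-values.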
format Text
id pubmed-2432498
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2432498 2008-07-02 Population Substructure and Control Selection in Genome-Wide Association Studies Yu, Kai Wang, Zhaoming Li, Qizhai Wacholder, Sholom Hunter, David J. Hoover, Robert N. Chanock, Stephen Thomas, Gilles PLoS One Research Article Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r² < 0.004) was selected to infer population substructure with principal component analysis. A novel permutation procedure was developed for the correction of PS that identified a smaller set of principal components and achieved a better control of type I error (to λ of 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed. Public Library of Science 2008-07-02 /pmc/articles/PMC2432498/ /pubmed/18596976 http://dx.doi.org/10.1371/journal.pone.0002551 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
spellingShingle Research Article
Yu, Kai
Wang, Zhaoming
Li, Qizhai
Wacholder, Sholom
Hunter, David J.
Hoover, Robert N.
Chanock, Stephen
Thomas, Gilles
Population Substructure and Control Selection in Genome-Wide Association Studies
title Population Substructure and Control Selection in Genome-Wide Association Studies
title_full Population Substructure and Control Selection in Genome-Wide Association Studies
title_fullStr Population Substructure and Control Selection in Genome-Wide Association Studies
title_full_unstemmed Population Substructure and Control Selection in Genome-Wide Association Studies
title_short Population Substructure and Control Selection in Genome-Wide Association Studies
title_sort population substructure and control selection in genome-wide association studies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2432498/
https://www.ncbi.nlm.nih.gov/pubmed/18596976
http://dx.doi.org/10.1371/journal.pone.0002551
work_keys_str_mv AT yukai populationsubstructureandcontrolselectioningenomewideassociationstudies
AT wangzhaoming populationsubstructureandcontrolselectioningenomewideassociationstudies
AT liqizhai populationsubstructureandcontrolselectioningenomewideassociationstudies
AT wacholdersholom populationsubstructureandcontrolselectioningenomewideassociationstudies
AT hunterdavidj populationsubstructureandcontrolselectioningenomewideassociationstudies
AT hooverrobertn populationsubstructureandcontrolselectioningenomewideassociationstudies
AT chanockstephen populationsubstructureandcontrolselectioningenomewideassociationstudies
AT thomasgilles populationsubstructureandcontrolselectioningenomewideassociationstudies