
Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip

Genome-wide association studies are revolutionizing the search for the genes underlying human complex diseases. The main decisions to be made at the design stage of these studies are the choice of the commercial genotyping chip to be used and the numbers of case and control samples to be genotyped....


Bibliographic Details
Main authors:	Spencer, Chris C. A.; Su, Zhan; Donnelly, Peter; Marchini, Jonathan
Format:	Text
Language:	English
Published:	Public Library of Science 2009
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2688469/ https://www.ncbi.nlm.nih.gov/pubmed/19492015 http://dx.doi.org/10.1371/journal.pgen.1000477

author	Spencer, Chris C. A.; Su, Zhan; Donnelly, Peter; Marchini, Jonathan
collection	PubMed
description	Genome-wide association studies are revolutionizing the search for the genes underlying human complex diseases. The main decisions to be made at the design stage of these studies are the choice of the commercial genotyping chip to be used and the numbers of case and control samples to be genotyped. The most common method of comparing different chips is using a measure of coverage, but this fails to properly account for the effects of sample size, the genetic model of the disease, and linkage disequilibrium between SNPs. In this paper, we argue that the statistical power to detect a causative variant should be the major criterion in study design. Because of the complicated pattern of linkage disequilibrium (LD) in the human genome, power cannot be calculated analytically and must instead be assessed by simulation. We describe in detail a method of simulating case-control samples at a set of linked SNPs that replicates the patterns of LD in human populations, and we used it to assess power for a comprehensive set of available genotyping chips. Our results allow us to compare the performance of the chips to detect variants with different effect sizes and allele frequencies, look at how power changes with sample size in different populations or when using multi-marker tags and genotype imputation approaches, and how performance compares to a hypothetical chip that contains every SNP in HapMap. A main conclusion of this study is that marked differences in genome coverage may not translate into appreciable differences in power and that, when taking budgetary considerations into account, the most powerful design may not always correspond to the chip with the highest coverage. We also show that genotype imputation can be used to boost the power of many chips up to the level obtained from a hypothetical “complete” chip containing all the SNPs in HapMap. Our results have been encapsulated into an R software package that allows users to design future association studies and our methods provide a framework with which new chip sets can be evaluated.
format	Text
id	pubmed-2688469
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-2688469 2009-06-02 Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip Spencer, Chris C. A.; Su, Zhan; Donnelly, Peter; Marchini, Jonathan PLoS Genet Research Article Public Library of Science 2009-05-15 /pmc/articles/PMC2688469/ /pubmed/19492015 http://dx.doi.org/10.1371/journal.pgen.1000477 Text en Spencer et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2688469/ https://www.ncbi.nlm.nih.gov/pubmed/19492015 http://dx.doi.org/10.1371/journal.pgen.1000477
