
On multi-marker tests for association in case-control studies

Genome-wide association studies (GWAs) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic...

Bibliographic Details
Main authors: Taub, Margaret A., Schwender, Holger R., Younkin, Samuel G., Louis, Thomas A., Ruczinski, Ingo
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2013
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3863805/
https://www.ncbi.nlm.nih.gov/pubmed/24379823
http://dx.doi.org/10.3389/fgene.2013.00252
author Taub, Margaret A.
Schwender, Holger R.
Younkin, Samuel G.
Louis, Thomas A.
Ruczinski, Ingo
collection PubMed
description Genome-wide association studies (GWAs) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic arrays interrogate hundreds of thousands or even millions of loci simultaneously, many causal yet undetected loci are believed to exist because the conditional power to achieve a genome-wide significance level can be low, in particular for markers with small effect sizes and low minor allele frequencies and in studies with modest sample size. However, linkage disequilibrium (LD) induces correlation between neighboring markers in the human genome, and hence between their test statistics; this correlation can be incorporated into multi-marker hypothesis tests, thereby increasing power to detect association. Herein, we establish a theoretical benchmark by quantifying the maximum power achievable for multi-marker tests of association in case-control studies, a maximum that can be attained only when the causal marker is known. Using the fact that genotype correlations within an LD block translate into an asymptotically multivariate normal distribution for score test statistics, we develop a set of weights for the markers that maximize the non-centrality parameter, and assess the relative loss of power for other approaches. We find that the method of Conneely and Boehnke (2007), based on the maximum absolute test statistic observed in an LD block, is a practical and powerful method in a variety of settings. We also explore the effect on power of prior biological or functional knowledge used to narrow down the locus of the causal marker, and conclude that this prior knowledge has to be very strong and specific for the power to approach the maximum achievable level, or even to beat the power observed for methods such as the one proposed by Conneely and Boehnke (2007).
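The description above mentions a test based on the maximum absolute test statistic in an LD block, with the score statistics treated as asymptotically multivariate normal. The following is a minimal sketch of that idea, not the procedure of Conneely and Boehnke (2007) or of the article itself: it assumes the null distribution is N(0, corr) and estimates the p-value by Monte Carlo simulation, which is a stand-in chosen here for simplicity. The function names, the correlation matrix, and the statistics are all hypothetical and made up for illustration.

```python
import math
import numpy as np

def max_abs_pvalue(z, corr, n_draws=200_000, seed=0):
    """Monte Carlo p-value for max |Z_i|, assuming Z ~ N(0, corr) under the null."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    corr = np.asarray(corr, dtype=float)
    observed = np.max(np.abs(z))
    # Simulate null statistics that share the LD-induced correlation structure.
    null = rng.multivariate_normal(np.zeros(len(z)), corr, size=n_draws)
    null_max = np.max(np.abs(null), axis=1)
    # Add-one correction keeps the estimate strictly positive.
    return (1 + np.sum(null_max >= observed)) / (n_draws + 1)

def bonferroni_pvalue(z):
    """Bonferroni-adjusted two-sided p-value for the most extreme marker."""
    z = np.asarray(z, dtype=float)
    p_single = math.erfc(np.max(np.abs(z)) / math.sqrt(2.0))
    return min(1.0, len(z) * p_single)

# Hypothetical LD block: 5 markers, correlation decaying as 0.8 ** |i - j|.
idx = np.arange(5)
corr = 0.8 ** np.abs(idx[:, None] - idx[None, :])
z = np.array([0.5, 1.2, 2.9, 2.1, 0.8])  # made-up score statistics

p_max = max_abs_pvalue(z, corr)
p_bonf = bonferroni_pvalue(z)
print(f"max-|Z| p-value: {p_max:.4f}, Bonferroni p-value: {p_bonf:.4f}")
```

Because the simulated null accounts for the correlation between markers, the max-|Z| p-value in a correlated block is smaller than the Bonferroni-adjusted one, which treats the markers as if they were independent tests.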
format Online
Article
Text
id pubmed-3863805
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3863805 2013-12-30 On multi-marker tests for association in case-control studies Taub, Margaret A. Schwender, Holger R. Younkin, Samuel G. Louis, Thomas A. Ruczinski, Ingo Front Genet Genetics Frontiers Media S.A. 2013-12-16 /pmc/articles/PMC3863805/ /pubmed/24379823 http://dx.doi.org/10.3389/fgene.2013.00252 Text en Copyright © 2013 Taub, Schwender, Younkin, Louis and Ruczinski. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title On multi-marker tests for association in case-control studies
topic Genetics