Greater power and computational efficiency for kernel-based association testing of sets of genetic variants

Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compar...

Bibliographic Details
Main Authors:	Lippert, Christoph; Xiang, Jing; Horta, Danilo; Widmer, Christian; Kadie, Carl; Heckerman, David; Listgarten, Jennifer
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2014
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4221116/ https://www.ncbi.nlm.nih.gov/pubmed/25075117 http://dx.doi.org/10.1093/bioinformatics/btu504

_version_	1782342850383446016
author	Lippert, Christoph; Xiang, Jing; Horta, Danilo; Widmer, Christian; Kadie, Carl; Heckerman, David; Listgarten, Jennifer
author_sort	Lippert, Christoph
collection	PubMed
description	Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test (a score test) with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene–gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. Thus we develop new computationally efficient methods. Results: After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test (up to 23 more associations), whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene–gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500. Availability: Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/. Contact: heckerma@microsoft.com Supplementary information: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-4221116
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-4221116 2014-11-10 Bioinformatics Original Papers Oxford University Press 2014-11-15 2014-07-29 /pmc/articles/PMC4221116/ /pubmed/25075117 http://dx.doi.org/10.1093/bioinformatics/btu504 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Greater power and computational efficiency for kernel-based association testing of sets of genetic variants
topic	Original Papers
