
A rare-variant test for high-dimensional data

Genome-wide association studies have facilitated the discovery of thousands of loci for hundreds of phenotypes. However, the issue of missing heritability remains unsolved for most complex traits. Locus discovery could be enhanced with both improved power through multi-phenotype analysis (MPA) and u...


Bibliographic Details
Main Authors: Kaakinen, Marika, Mägi, Reedik, Fischer, Krista, Heikkinen, Jani, Järvelin, Marjo-Riitta, Morris, Andrew P, Prokopenko, Inga
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5513099/
https://www.ncbi.nlm.nih.gov/pubmed/28537275
http://dx.doi.org/10.1038/ejhg.2017.90
author Kaakinen, Marika
Mägi, Reedik
Fischer, Krista
Heikkinen, Jani
Järvelin, Marjo-Riitta
Morris, Andrew P
Prokopenko, Inga
collection PubMed
description Genome-wide association studies have facilitated the discovery of thousands of loci for hundreds of phenotypes. However, the issue of missing heritability remains unsolved for most complex traits. Locus discovery could be enhanced with both improved power through multi-phenotype analysis (MPA) and use of a wider allele frequency range, including rare variants (RVs). MPA methods for single-variant association have been proposed, but given their low power for RVs, more efficient approaches are required. We propose multi-phenotype analysis of rare variants (MARV), a burden test-based method for RVs extended to the joint analysis of multiple phenotypes through a powerful reverse regression technique. Specifically, MARV models the proportion of RVs at which minor alleles are carried by individuals within a genomic region as a linear combination of multiple phenotypes, which can be both binary and continuous, and the method directly accommodates both genotyped and imputed data. The full model, including all phenotypes, is tested for association for discovery, and a more thorough dissection of the phenotype combinations for any set of RVs is also enabled. We show, via simulations, that the type I error rate is well controlled under various correlations between two continuous phenotypes, and that the method outperforms a univariate burden test in all considered scenarios. Application of MARV to 4876 individuals from the Northern Finland Birth Cohort 1966 for triglycerides and high- and low-density lipoprotein cholesterol highlights known loci with stronger signals of association than those observed in univariate RV analyses and suggests novel RV effects for these lipid traits.
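The reverse regression idea in the description can be illustrated with a minimal sketch. It is not the authors' MARV software: the function names `burden` and `reverse_regression_f`, the MAF threshold of 0.01, the use of hard-call genotypes (0/1/2), and the choice of an F-test for the full model against an intercept-only model are all assumptions made for illustration. The abstract does not state which test statistic MARV uses.

```python
import numpy as np


def burden(genotypes, maf_threshold=0.01):
    """Proportion of rare variants at which each individual carries a minor allele.

    genotypes: (n_individuals, n_variants) array of minor-allele counts (0, 1, 2).
    maf_threshold is an illustrative cut-off, not a value taken from the record.
    """
    g = np.asarray(genotypes, dtype=float)
    maf = g.mean(axis=0) / 2.0                 # allele frequency per variant
    rare = (maf > 0) & (maf < maf_threshold)   # keep polymorphic rare variants
    if not rare.any():
        raise ValueError("no rare variants in this region")
    carries = g[:, rare] > 0                   # carrier indicator per variant
    return carries.mean(axis=1)


def reverse_regression_f(burden_values, phenotypes):
    """Regress the burden on all phenotypes jointly (reverse regression).

    Returns the OLS coefficients (intercept first), the F statistic for the
    full model versus an intercept-only model, and its degrees of freedom.
    """
    y = np.asarray(burden_values, dtype=float)
    p = np.asarray(phenotypes, dtype=float)
    if p.ndim == 1:
        p = p[:, None]
    n, k = p.shape
    X = np.column_stack([np.ones(n), p])       # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_full = np.sum((y - X @ beta) ** 2)
    rss_null = np.sum((y - y.mean()) ** 2)
    df1, df2 = k, n - k - 1
    f_stat = ((rss_null - rss_full) / df1) / (rss_full / df2)
    return beta, f_stat, (df1, df2)


# Simulated example: 500 individuals, 40 variants, 2 phenotypes (no true effect).
rng = np.random.default_rng(0)
n = 500
geno = rng.binomial(2, 0.005, size=(n, 40))
pheno = rng.normal(size=(n, 2))
beta, f_stat, dof = reverse_regression_f(burden(geno), pheno)
print("coefficients:", beta)
print("F statistic:", f_stat, "on", dof, "degrees of freedom")
```

A p-value could then be read from the F distribution with the returned degrees of freedom. Because the burden is the response, binary and continuous phenotypes enter the same model simply as columns of the predictor matrix.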
format Online
Article
Text
id pubmed-5513099
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5513099 2017-08-29 A rare-variant test for high-dimensional data Eur J Hum Genet Article
Nature Publishing Group 2017-08 2017-05-24 /pmc/articles/PMC5513099/ /pubmed/28537275 http://dx.doi.org/10.1038/ejhg.2017.90 Text en Copyright © 2017 The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title A rare-variant test for high-dimensional data
topic Article