MultiPhen: Joint Model of Multiple Phenotypes Can Increase Discovery in GWAS

The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare...

Bibliographic Details
Main authors: O’Reilly, Paul F., Hoggart, Clive J., Pomyen, Yotsawat, Calboli, Federico C. F., Elliott, Paul, Jarvelin, Marjo-Riitta, Coin, Lachlan J. M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3342314/
https://www.ncbi.nlm.nih.gov/pubmed/22567092
http://dx.doi.org/10.1371/journal.pone.0034861
_version_ 1782231677835149312
author O’Reilly, Paul F.
Hoggart, Clive J.
Pomyen, Yotsawat
Calboli, Federico C. F.
Elliott, Paul
Jarvelin, Marjo-Riitta
Coin, Lachlan J. M.
collection PubMed
description The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. 
The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.
format Online
Article
Text
id pubmed-3342314
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3342314 2012-05-07 MultiPhen: Joint Model of Multiple Phenotypes Can Increase Discovery in GWAS O’Reilly, Paul F. Hoggart, Clive J. Pomyen, Yotsawat Calboli, Federico C. F. Elliott, Paul Jarvelin, Marjo-Riitta Coin, Lachlan J. M. PLoS One Research Article Public Library of Science 2012-05-02 /pmc/articles/PMC3342314/ /pubmed/22567092 http://dx.doi.org/10.1371/journal.pone.0034861 Text en O’Reilly et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title MultiPhen: Joint Model of Multiple Phenotypes Can Increase Discovery in GWAS
topic Research Article
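
The description above notes that the linear combinations of lipids estimated at the leading SNPs reflect the Friedewald Formula. For readers unfamiliar with it, the following is a minimal sketch of that formula written as a linear combination, assuming all lipid values are in mg/dL and the standard form LDL = TC - HDL - TG/5. The function names, the weight table, and the sample values are illustrative assumptions and do not come from the record or from the MultiPhen software.

```python
# Illustrative sketch only: the Friedewald estimate of LDL cholesterol,
# expressed as a weighted sum of three measured lipids (all in mg/dL).
# Names and sample values are hypothetical, not taken from the article.

# Weights of the linear combination: LDL = 1*TC - 1*HDL - 0.2*TG
FRIEDEWALD_WEIGHTS = {"TC": 1.0, "HDL": -1.0, "TG": -0.2}


def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL cholesterol (mg/dL) as TC - HDL - TG/5."""
    # The estimate is generally regarded as unreliable at high
    # triglyceride levels, so refuse to compute it there.
    if triglycerides >= 400:
        raise ValueError("Friedewald estimate is not used for TG >= 400 mg/dL")
    return total_chol - hdl - triglycerides / 5.0


def linear_combination(weights, values):
    """Weighted sum of the named measurements."""
    return sum(weights[name] * values[name] for name in weights)


# Hypothetical lipid panel for one individual (mg/dL).
sample = {"TC": 200.0, "HDL": 50.0, "TG": 150.0}

ldl_direct = friedewald_ldl(sample["TC"], sample["HDL"], sample["TG"])
ldl_weighted = linear_combination(FRIEDEWALD_WEIGHTS, sample)

print(ldl_direct)    # 120.0
print(ldl_weighted)  # the same estimate, up to floating-point rounding
```

Writing the formula as a weight vector shows what it would mean for an estimated combination of phenotypes to "reflect" it: the fitted coefficients on total cholesterol, HDL, and triglycerides would be roughly proportional to 1, -1, and -0.2.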