PhenoSpD: an integrated toolkit for phenotypic correlation estimation and multiple testing correction using GWAS summary statistics
BACKGROUND: Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art meth...
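Main authors: | Zheng, Jie; Richardson, Tom G; Millard, Louise A C; Hemani, Gibran; Elsworth, Benjamin L; Raistrick, Christopher A; Vilhjalmsson, Bjarni; Neale, Benjamin M; Haycock, Philip C; Smith, George Davey; Gaunt, Tom R |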
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2018 |
Subjects: | Technical Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6109640/ https://www.ncbi.nlm.nih.gov/pubmed/30165448 http://dx.doi.org/10.1093/gigascience/giy090 |
_version_ | 1783350358369632256 |
---|---|
author | Zheng, Jie Richardson, Tom G Millard, Louise A C Hemani, Gibran Elsworth, Benjamin L Raistrick, Christopher A Vilhjalmsson, Bjarni Neale, Benjamin M Haycock, Philip C Smith, George Davey Gaunt, Tom R |
author_facet | Zheng, Jie Richardson, Tom G Millard, Louise A C Hemani, Gibran Elsworth, Benjamin L Raistrick, Christopher A Vilhjalmsson, Bjarni Neale, Benjamin M Haycock, Philip C Smith, George Davey Gaunt, Tom R |
author_sort | Zheng, Jie |
collection | PubMed |
description | BACKGROUND: Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results. RESULTS: Here, we present an integrated R toolkit, PhenoSpD, to use LD score regression to estimate phenotypic correlations using GWAS summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest that it is possible to identify nonindependence of phenotypes using samples with partial overlap; as overlap decreases, the estimated phenotypic correlations will attenuate toward zero and multiple testing correction will be more stringent than in perfectly overlapping samples. Also, in contrast to LD score regression, metaCCA will provide approximate genetic correlations rather than phenotypic correlation, which limits its application for multiple testing correction. In a case study, PhenoSpD using UK Biobank GWAS results suggested 399.6 independent tests among 487 human traits, which is close to the 352.4 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to an estimated 5,618 pair-wise phenotypic correlations among 107 metabolites using GWAS summary statistics from Kettunen's publication and PhenoSpD suggested the equivalent of 33.5 independent tests for these metabolites. CONCLUSIONS: PhenoSpD extends the use of summary-level results, providing a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics. This is particularly valuable in the age of large-scale biobank and consortia studies, where GWAS results are much more accessible than individual-level data. |
format | Online Article Text |
id | pubmed-6109640 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-61096402018-08-30 PhenoSpD: an integrated toolkit for phenotypic correlation estimation and multiple testing correction using GWAS summary statistics Zheng, Jie Richardson, Tom G Millard, Louise A C Hemani, Gibran Elsworth, Benjamin L Raistrick, Christopher A Vilhjalmsson, Bjarni Neale, Benjamin M Haycock, Philip C Smith, George Davey Gaunt, Tom R Gigascience Technical Note BACKGROUND: Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to much individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. Two state-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using only genome-wide association study (GWAS) summary results. RESULTS: Here, we present an integrated R toolkit, PhenoSpD, to use LD score regression to estimate phenotypic correlations using GWAS summary statistics and to utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest that it is possible to identify nonindependence of phenotypes using samples with partial overlap; as overlap decreases, the estimated phenotypic correlations will attenuate toward zero and multiple testing correction will be more stringent than in perfectly overlapping samples. Also, in contrast to LD score regression, metaCCA will provide approximate genetic correlations rather than phenotypic correlation, which limits its application for multiple testing correction. In a case study, PhenoSpD using UK Biobank GWAS results suggested 399.6 independent tests among 487 human traits, which is close to the 352.4 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to an estimated 5,618 pair-wise phenotypic correlations among 107 metabolites using GWAS summary statistics from Kettunen's publication and PhenoSpD suggested the equivalent of 33.5 independent tests for these metabolites. CONCLUSIONS: PhenoSpD extends the use of summary-level results, providing a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics. This is particularly valuable in the age of large-scale biobank and consortia studies, where GWAS results are much more accessible than individual-level data. Oxford University Press 2018-08-24 /pmc/articles/PMC6109640/ /pubmed/30165448 http://dx.doi.org/10.1093/gigascience/giy090 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Zheng, Jie Richardson, Tom G Millard, Louise A C Hemani, Gibran Elsworth, Benjamin L Raistrick, Christopher A Vilhjalmsson, Bjarni Neale, Benjamin M Haycock, Philip C Smith, George Davey Gaunt, Tom R PhenoSpD: an integrated toolkit for phenotypic correlation estimation and multiple testing correction using GWAS summary statistics |
title | PhenoSpD: an integrated toolkit for phenotypic correlation estimation and multiple testing correction using GWAS summary statistics |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6109640/ https://www.ncbi.nlm.nih.gov/pubmed/30165448 http://dx.doi.org/10.1093/gigascience/giy090 |
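
The description field in this record says that PhenoSpD applies spectral decomposition (SpD) to a phenotypic correlation matrix to estimate how many independent tests a set of correlated traits represents. The Python sketch below illustrates that general idea only; it is not PhenoSpD's R code. It assumes two widely used eigenvalue-based estimators (Nyholt 2004; Li and Ji 2005), and this record does not state which variant PhenoSpD reports. The function name `effective_tests`, the `decimals` rounding parameter, and the toy matrix are hypothetical choices made for illustration.

```python
import numpy as np


def effective_tests(corr, decimals=8):
    """Estimate the effective number of independent tests from a
    symmetric trait-by-trait correlation matrix.

    Returns a tuple (nyholt, li_ji). Illustrative sketch only.
    """
    corr = np.asarray(corr, dtype=float)
    m = corr.shape[0]
    if corr.ndim != 2 or corr.shape[1] != m or m < 2:
        raise ValueError("corr must be a square matrix with at least 2 traits")

    # Eigenvalues of a symmetric matrix; absolute values guard against
    # tiny negative eigenvalues caused by noisy correlation estimates.
    eig = np.abs(np.linalg.eigvalsh(corr))
    # Assumption: round away floating-point noise so that an eigenvalue
    # such as 3.9999999999 is not split into 1 + 0.9999999999 below.
    eig = np.round(eig, decimals)

    # Nyholt (2004): Meff = 1 + (M - 1) * (1 - Var(eigenvalues) / M)
    nyholt = 1.0 + (m - 1) * (1.0 - np.var(eig, ddof=1) / m)

    # Li and Ji (2005): each eigenvalue >= 1 counts as one full test,
    # plus the fractional part of every eigenvalue.
    li_ji = float(np.sum((eig >= 1).astype(float) + (eig - np.floor(eig))))

    return float(nyholt), li_ji


if __name__ == "__main__":
    # Toy example: three traits, two of them strongly correlated.
    toy = np.array([
        [1.0, 0.8, 0.0],
        [0.8, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ])
    nyholt, li_ji = effective_tests(toy)
    alpha = 0.05
    print("Nyholt Meff:", nyholt, "threshold:", alpha / nyholt)
    print("Li-Ji Meff:", li_ji, "threshold:", alpha / li_ji)
```

With fully independent traits (an identity matrix) both estimators return the number of traits, and with perfectly correlated traits both return 1, so the corrected threshold `alpha / Meff` lies between the unadjusted and the full Bonferroni thresholds.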