Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies
Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values....
Main Authors: | Feng, Helian; Mancuso, Nicholas; Gusev, Alexander; Majumdar, Arunabha; Major, Megan; Pasaniuc, Bogdan; Kraft, Peter |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057593/ https://www.ncbi.nlm.nih.gov/pubmed/33831007 http://dx.doi.org/10.1371/journal.pgen.1008973 |
_version_ | 1783680867664658432 |
---|---|
author | Feng, Helian Mancuso, Nicholas Gusev, Alexander Majumdar, Arunabha Major, Megan Pasaniuc, Bogdan Kraft, Peter |
author_sort | Feng, Helian |
collection | PubMed |
description | Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values. Consequently, TWAS power can be low when the expression quantitative trait locus (eQTL) data used to train the genetic predictors have small sample sizes, or when data from causally relevant tissues are not available. Here, we propose to address these issues by integrating multiple tissues in the TWAS using sparse canonical correlation analysis (sCCA). We show that sCCA-TWAS combined with single-tissue TWAS using an aggregate Cauchy association test (ACAT) outperforms traditional single-tissue TWAS. In empirically motivated simulations, the sCCA+ACAT approach yielded the highest power to detect a gene associated with phenotype, even when expression in the causal tissue was not directly measured, while controlling the Type I error when there is no association between gene expression and phenotype. For example, when gene expression explains 2% of the variability in outcome and the GWAS sample size is 20,000, the average power difference between the ACAT combined test of sCCA features and single-tissue models, versus single-tissue models combined with the Generalized Berk-Jones (GBJ) method, single-tissue models combined with S-MultiXcan, UTMOST, or summaries of cross-tissue expression patterns from Principal Component Analysis (PCA), was 5%, 8%, 5% and 38%, respectively. The gain in power is likely due to sCCA cross-tissue features being more likely to be detectably heritable. When applied to publicly available summary statistics from 10 complex traits, the sCCA+ACAT test was able to increase the number of testable genes and identify on average an additional 400 gene-trait associations that single-trait TWAS missed. Our results suggest that aggregating eQTL data across multiple tissues using sCCA can improve the sensitivity of TWAS while controlling for the false positive rate. |
format | Online Article Text |
id | pubmed-8057593 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8057593 2021-05-04 Feng, Helian Mancuso, Nicholas Gusev, Alexander Majumdar, Arunabha Major, Megan Pasaniuc, Bogdan Kraft, Peter PLoS Genet Research Article Public Library of Science 2021-04-08 /pmc/articles/PMC8057593/ /pubmed/33831007 http://dx.doi.org/10.1371/journal.pgen.1008973 Text en © 2021 Feng et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057593/ https://www.ncbi.nlm.nih.gov/pubmed/33831007 http://dx.doi.org/10.1371/journal.pgen.1008973 |
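
The description in this record names the aggregate Cauchy association test (ACAT) as the way per-tissue and sCCA-feature TWAS results are combined into one gene-level test. As a rough illustration only, the following minimal sketch implements the standard Cauchy combination rule: each p-value is mapped to a Cauchy-distributed score, the scores are averaged with weights, and the average is mapped back to a p-value. The function name `acat`, the equal default weights, and the example p-values are assumptions made for illustration; they are not taken from the article or its software.

```python
import math


def acat(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (illustrative sketch).

    Statistic: T = sum_i w_i * tan((0.5 - p_i) * pi), with weights summing to 1.
    Combined p-value: 0.5 - atan(T) / pi.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    total = sum(weights)
    # Weighted average of Cauchy-transformed p-values.
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Hypothetical p-values for one gene: three single-tissue tests
    # followed by two sCCA-feature tests.
    example = [0.04, 0.30, 0.65, 0.002, 0.20]
    print(acat(example))
```

A single very small p-value dominates the combined statistic, which is why this kind of test keeps power when only one tissue or cross-tissue feature carries the signal. For extremely small p-values a production implementation would typically switch to an approximation of the tangent term to avoid loss of precision; that refinement is omitted here.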