Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies

Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values....

Bibliographic Details
Main Authors: Feng, Helian; Mancuso, Nicholas; Gusev, Alexander; Majumdar, Arunabha; Major, Megan; Pasaniuc, Bogdan; Kraft, Peter
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057593/
https://www.ncbi.nlm.nih.gov/pubmed/33831007
http://dx.doi.org/10.1371/journal.pgen.1008973
_version_ 1783680867664658432
author Feng, Helian
Mancuso, Nicholas
Gusev, Alexander
Majumdar, Arunabha
Major, Megan
Pasaniuc, Bogdan
Kraft, Peter
author_sort Feng, Helian
collection PubMed
description Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values. Consequently, TWAS power can be low when the expression quantitative trait locus (eQTL) data used to train the genetic predictors have small sample sizes, or when data from causally relevant tissues are not available. Here, we propose to address these issues by integrating multiple tissues in the TWAS using sparse canonical correlation analysis (sCCA). We show that sCCA-TWAS combined with single-tissue TWAS using an aggregate Cauchy association test (ACAT) outperforms traditional single-tissue TWAS. In empirically motivated simulations, the sCCA+ACAT approach yielded the highest power to detect a gene associated with phenotype, even when expression in the causal tissue was not directly measured, while controlling the Type I error when there is no association between gene expression and phenotype. For example, when gene expression explains 2% of the variability in outcome and the GWAS sample size is 20,000, the average power difference between the ACAT test combining sCCA features with single-tissue models and each of four alternatives (single-tissue models combined with the Generalized Berk-Jones (GBJ) method, single-tissue models combined with S-MultiXcan, UTMOST, and summarizing cross-tissue expression patterns using Principal Component Analysis (PCA)) was 5%, 8%, 5% and 38%, respectively. The gain in power is likely due to sCCA cross-tissue features being more likely to be detectably heritable. When applied to publicly available summary statistics for 10 complex traits, the sCCA+ACAT test increased the number of testable genes and identified on average an additional 400 gene-trait associations that single-trait TWAS missed. Our results suggest that aggregating eQTL data across multiple tissues using sCCA can improve the sensitivity of TWAS while controlling the false positive rate.
format Online
Article
Text
id pubmed-8057593
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8057593 2021-05-04 PLoS Genet Research Article Public Library of Science 2021-04-08 /pmc/articles/PMC8057593/ /pubmed/33831007 http://dx.doi.org/10.1371/journal.pgen.1008973 Text en © 2021 Feng et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057593/
https://www.ncbi.nlm.nih.gov/pubmed/33831007
http://dx.doi.org/10.1371/journal.pgen.1008973