
CoMM-S(4): A Collaborative Mixed Model Using Summary-Level eQTL and GWAS Datasets in Transcriptome-Wide Association Studies


Bibliographic Details
Main authors: Yang, Yi; Yeung, Kar-Fu; Liu, Jin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8488198/
https://www.ncbi.nlm.nih.gov/pubmed/34616426
http://dx.doi.org/10.3389/fgene.2021.704538
author Yang, Yi
Yeung, Kar-Fu
Liu, Jin
collection PubMed
description Motivation: Genome-wide association studies (GWAS) have achieved remarkable success in identifying SNP-trait associations in the last decade. However, it is challenging to identify the mechanisms that connect the genetic variants with complex traits as the majority of GWAS associations are in non-coding regions. Methods that integrate genomic and transcriptomic data allow us to investigate how genetic variants may affect a trait through their effect on gene expression. These include CoMM and CoMM-S(2), likelihood-ratio-based methods that integrate GWAS and eQTL studies to assess expression-trait association. However, their reliance on individual-level eQTL data renders them inapplicable when only summary-level eQTL results, such as those from large-scale eQTL analyses, are available. Result: We develop an efficient probabilistic model, CoMM-S(4), to explore the expression-trait association using summary-level eQTL and GWAS datasets. Compared with CoMM-S(2), which uses individual-level eQTL data, CoMM-S(4) requires only summary-level eQTL data. To test expression-trait association, an efficient variational Bayesian EM algorithm and a likelihood ratio test were constructed. We applied CoMM-S(4) to both simulated and real data. The simulation results demonstrate that CoMM-S(4) can perform as well as CoMM-S(2) and S-PrediXcan, and analyses using GWAS summary statistics from Biobank Japan and eQTL summary statistics from eQTLGen and GTEx suggest novel susceptibility loci for cardiovascular diseases and osteoporosis. Availability and implementation: The developed R package is available at https://github.com/gordonliu810822/CoMM.
format Online
Article
Text
id pubmed-8488198
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8488198 2021-10-05 CoMM-S(4): A Collaborative Mixed Model Using Summary-Level eQTL and GWAS Datasets in Transcriptome-Wide Association Studies Yang, Yi; Yeung, Kar-Fu; Liu, Jin Front Genet Genetics Motivation: Genome-wide association studies (GWAS) have achieved remarkable success in identifying SNP-trait associations in the last decade. However, it is challenging to identify the mechanisms that connect the genetic variants with complex traits as the majority of GWAS associations are in non-coding regions. Methods that integrate genomic and transcriptomic data allow us to investigate how genetic variants may affect a trait through their effect on gene expression. These include CoMM and CoMM-S(2), likelihood-ratio-based methods that integrate GWAS and eQTL studies to assess expression-trait association. However, their reliance on individual-level eQTL data renders them inapplicable when only summary-level eQTL results, such as those from large-scale eQTL analyses, are available. Result: We develop an efficient probabilistic model, CoMM-S(4), to explore the expression-trait association using summary-level eQTL and GWAS datasets. Compared with CoMM-S(2), which uses individual-level eQTL data, CoMM-S(4) requires only summary-level eQTL data. To test expression-trait association, an efficient variational Bayesian EM algorithm and a likelihood ratio test were constructed. We applied CoMM-S(4) to both simulated and real data. The simulation results demonstrate that CoMM-S(4) can perform as well as CoMM-S(2) and S-PrediXcan, and analyses using GWAS summary statistics from Biobank Japan and eQTL summary statistics from eQTLGen and GTEx suggest novel susceptibility loci for cardiovascular diseases and osteoporosis. Availability and implementation: The developed R package is available at https://github.com/gordonliu810822/CoMM. Frontiers Media S.A.
2021-09-20 /pmc/articles/PMC8488198/ /pubmed/34616426 http://dx.doi.org/10.3389/fgene.2021.704538 Text en Copyright © 2021 Yang, Yeung and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title CoMM-S(4): A Collaborative Mixed Model Using Summary-Level eQTL and GWAS Datasets in Transcriptome-Wide Association Studies
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8488198/
https://www.ncbi.nlm.nih.gov/pubmed/34616426
http://dx.doi.org/10.3389/fgene.2021.704538