A clustering linear combination method for multiple phenotype association studies based on GWAS summary statistics
There is strong evidence showing that joint analysis of multiple phenotypes in genome-wide association studies (GWAS) can increase statistical power when detecting the association between genetic variants and human complex diseases. We previously developed the Clustering Linear Combination (CLC) met...
Main authors: | Wang, Meida; Cao, Xuewei; Zhang, Shuanglin; Sha, Qiuying |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9975197/ https://www.ncbi.nlm.nih.gov/pubmed/36854754 http://dx.doi.org/10.1038/s41598-023-30415-3 |
_version_ | 1784898822279790592 |
---|---|
author | Wang, Meida; Cao, Xuewei; Zhang, Shuanglin; Sha, Qiuying |
author_sort | Wang, Meida |
collection | PubMed |
description | There is strong evidence that joint analysis of multiple phenotypes in genome-wide association studies (GWAS) can increase statistical power to detect associations between genetic variants and complex human diseases. We previously developed the Clustering Linear Combination (CLC) method and a computationally efficient CLC (ceCLC) method to test the association between multiple phenotypes and a genetic variant; both perform very well. However, both methods require individual-level genotypes and phenotypes, which are often not easily accessible. In this research, we develop a novel method called sCLC for association studies of multiple phenotypes and a genetic variant based on GWAS summary statistics. We use LD score regression to estimate the correlation matrix among phenotypes. The test statistic of sCLC is constructed from GWAS summary statistics and has an approximate Cauchy distribution. We perform a variety of simulation studies and compare sCLC with other commonly used methods for multiple phenotype association studies using GWAS summary statistics. Simulation results show that sCLC controls Type I error rates well and has the highest power in most scenarios. Moreover, we apply the newly developed method to the UK Biobank GWAS summary statistics from category XIII, with 70 related musculoskeletal system and connective tissue phenotypes. The results demonstrate that sCLC detects the largest number of significant SNPs, and most of these SNPs can be mapped to genes that the GWAS Catalog reports as associated with those phenotypes. Furthermore, sCLC identifies some novel signals that were missed by standard GWAS, which provide new insight into the potential genetic factors underlying musculoskeletal system and connective tissue phenotypes. |
format | Online Article Text |
id | pubmed-9975197 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9975197 2023-03-02 A clustering linear combination method for multiple phenotype association studies based on GWAS summary statistics. Wang, Meida; Cao, Xuewei; Zhang, Shuanglin; Sha, Qiuying. Sci Rep. Article. Nature Publishing Group UK 2023-02-28 /pmc/articles/PMC9975197/ /pubmed/36854754 http://dx.doi.org/10.1038/s41598-023-30415-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
title | A clustering linear combination method for multiple phenotype association studies based on GWAS summary statistics |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9975197/ https://www.ncbi.nlm.nih.gov/pubmed/36854754 http://dx.doi.org/10.1038/s41598-023-30415-3 |
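
The description states that the sCLC test statistic has an approximate Cauchy distribution, but the record does not give the statistic itself. As a rough illustration of why a Cauchy reference distribution is convenient, the sketch below implements the generic Cauchy combination rule for merging several p-values into one. It is not the authors' sCLC statistic: the function name `cauchy_combine`, the equal default weights, and the example p-values are all assumptions made for illustration.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the generic Cauchy combination rule.

    Hypothetical helper for illustration only; it is not taken from the
    sCLC paper. Each p-value must lie strictly between 0 and 1.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")

    # Normalize the weights so that they sum to one.
    total = sum(weights)
    weights = [w / total for w in weights]

    # Under the null, tan((0.5 - p) * pi) is a standard Cauchy variate, and a
    # weighted average of such variates is again (approximately) standard Cauchy.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))

    # Upper-tail probability of a standard Cauchy distribution at t.
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # Made-up p-values, e.g. one per candidate number of phenotype clusters.
    print(cauchy_combine([0.01, 0.2, 0.5]))
```

The appeal of this rule is that the combined p-value has a closed form, so no permutation or simulation is needed to calibrate it; whether sCLC uses exactly this construction should be checked against the article itself.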