A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS
There has been an increasing interest in joint analysis of multiple phenotypes in genome-wide association studies (GWAS) because jointly analyzing multiple phenotypes may increase statistical power to detect genetic variants associated with complex diseases or traits. Recently, many statistical meth...
Main authors: | Wang, Meida; Zhang, Shuanglin; Sha, Qiuying |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049312/ https://www.ncbi.nlm.nih.gov/pubmed/35482827 http://dx.doi.org/10.1371/journal.pone.0260911 |
_version_ | 1784696116637335552 |
---|---|
author | Wang, Meida Zhang, Shuanglin Sha, Qiuying |
author_facet | Wang, Meida Zhang, Shuanglin Sha, Qiuying |
author_sort | Wang, Meida |
collection | PubMed |
description | There has been increasing interest in the joint analysis of multiple phenotypes in genome-wide association studies (GWAS), because jointly analyzing multiple phenotypes may increase the statistical power to detect genetic variants associated with complex diseases or traits. Many statistical methods have recently been developed for joint analysis of multiple phenotypes in genetic association studies, including the Clustering Linear Combination (CLC) method. The CLC method works particularly well with phenotypes that have natural groupings. However, because the number of clusters for a given data set is unknown, the final test statistic of the CLC method is the minimum p-value among the p-values of the CLC test statistics obtained from each possible number of clusters. A simulation procedure is therefore needed to evaluate the p-value of the final test statistic, which makes the CLC method computationally demanding. We develop a new method called computationally efficient CLC (ceCLC) to test the association between multiple phenotypes and a genetic variant. Instead of using the minimum p-value as the test statistic, as the CLC method does, ceCLC uses the Cauchy combination test to combine the p-values of the CLC test statistics obtained from each possible number of clusters. The test statistic of ceCLC approximately follows a standard Cauchy distribution, so the p-value can be obtained from the cumulative distribution function without a simulation procedure. Extensive simulation studies and an application to the COPDGene data demonstrate that the type I error rates of ceCLC are effectively controlled in different simulation settings, and that ceCLC either outperforms all other methods or has statistical power very close to that of the most powerful method with which it has been compared. |
format | Online Article Text |
id | pubmed-9049312 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9049312 2022-04-29 A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS Wang, Meida Zhang, Shuanglin Sha, Qiuying PLoS One Research Article Public Library of Science 2022-04-28 /pmc/articles/PMC9049312/ /pubmed/35482827 http://dx.doi.org/10.1371/journal.pone.0260911 Text en © 2022 Wang et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Wang, Meida Zhang, Shuanglin Sha, Qiuying A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS |
title | A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS |
title_full | A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS |
title_fullStr | A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS |
title_full_unstemmed | A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS |
title_short | A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS |
title_sort | computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for gwas |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049312/ https://www.ncbi.nlm.nih.gov/pubmed/35482827 http://dx.doi.org/10.1371/journal.pone.0260911 |
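
The description above says that ceCLC combines the per-cluster-count p-values with the Cauchy combination test and reads the final p-value from the standard Cauchy distribution. A minimal sketch of that combination step follows. It is an illustration only, not the authors' implementation: the function name `cauchy_combination`, the equal default weights, and the example p-values are assumptions, and the CLC p-values themselves are taken as given inputs.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    Hypothetical helper for illustration. Equal weights are an assumption;
    the record does not state which weights ceCLC uses.
    Returns (test statistic, combined p-value).
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0 / k] * k  # assumed: equal non-negative weights summing to 1
    if len(weights) != k:
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    # Each p-value is mapped to a standard Cauchy variate: tan((0.5 - p) * pi).
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))

    # Upper-tail probability of a standard Cauchy distribution at `statistic`.
    combined_p = 0.5 - math.atan(statistic) / math.pi
    return statistic, combined_p


if __name__ == "__main__":
    # Made-up p-values, one per candidate number of clusters.
    example = [0.20, 0.004, 0.03, 0.60]
    stat, p = cauchy_combination(example)
    print(f"statistic = {stat:.4f}, combined p-value = {p:.6f}")
```

Because the combined p-value comes from a closed-form tail probability, no simulation loop is needed, which is the computational saving the description attributes to ceCLC over the minimum-p-value approach. As a sanity check on the formula, combining several identical p-values under equal weights returns that same p-value up to rounding.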