
A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS

There has been an increasing interest in joint analysis of multiple phenotypes in genome-wide association studies (GWAS) because jointly analyzing multiple phenotypes may increase statistical power to detect genetic variants associated with complex diseases or traits. Recently, many statistical meth...


Bibliographic Details
Main Authors: Wang, Meida; Zhang, Shuanglin; Sha, Qiuying
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049312/
https://www.ncbi.nlm.nih.gov/pubmed/35482827
http://dx.doi.org/10.1371/journal.pone.0260911
_version_ 1784696116637335552
author Wang, Meida
Zhang, Shuanglin
Sha, Qiuying
collection PubMed
description There has been increasing interest in the joint analysis of multiple phenotypes in genome-wide association studies (GWAS), because jointly analyzing multiple phenotypes may increase statistical power to detect genetic variants associated with complex diseases or traits. Recently, many statistical methods have been developed for joint analysis of multiple phenotypes in genetic association studies, including the Clustering Linear Combination (CLC) method. The CLC method works particularly well with phenotypes that have natural groupings. However, because the number of clusters for a given data set is unknown, the final test statistic of the CLC method is the minimum p-value among the p-values of the CLC test statistics obtained from each possible number of clusters. A simulation procedure is therefore needed to evaluate the p-value of the final test statistic, which makes the CLC method computationally demanding. We develop a new method, computationally efficient CLC (ceCLC), to test the association between multiple phenotypes and a genetic variant. Instead of using the minimum p-value as the test statistic, as the CLC method does, ceCLC uses the Cauchy combination test to combine the p-values of the CLC test statistics obtained from each possible number of clusters. The test statistic of ceCLC approximately follows a standard Cauchy distribution, so the p-value can be obtained from the cumulative distribution function without a simulation procedure. Extensive simulation studies and an application to the COPDGene data demonstrate that the type I error rates of ceCLC are effectively controlled in different simulation settings, and that ceCLC either outperforms all other methods or has statistical power very close to that of the most powerful method with which it was compared.
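The combination step described in the abstract can be illustrated with a minimal sketch of a generic Cauchy combination test. This is not the authors' implementation: the function name `cauchy_combination`, the equal default weights, and the example p-values are assumptions made for illustration, and the CLC p-values for each candidate number of clusters are taken as given inputs rather than computed.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with a Cauchy combination test (illustrative sketch).

    Each p-value is mapped to tan((0.5 - p) * pi), which is standard Cauchy
    under the null. The weighted sum T is treated as approximately standard
    Cauchy, so the combined p-value comes from the closed-form tail
    probability 0.5 - atan(T) / pi, with no simulation step.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumption: equal weights across the candidate numbers of clusters.
        weights = [1.0] * n
    if len(weights) != n or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative, one per p-value")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")

    # Weighted sum of Cauchy-transformed p-values (weights normalised to 1).
    statistic = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper-tail probability of a standard Cauchy variable at the statistic.
    combined_p = 0.5 - math.atan(statistic) / math.pi
    return statistic, combined_p


if __name__ == "__main__":
    # Hypothetical p-values, one per candidate number of clusters.
    clc_p_values = [0.012, 0.034, 0.20, 0.45]
    stat, p = cauchy_combination(clc_p_values)
    print(f"statistic = {stat:.4f}, combined p-value = {p:.6f}")
```

The closed-form tail probability is what removes the simulation step that a minimum-p-value statistic would require. For extremely small p-values the tangent becomes numerically delicate, which this sketch does not address.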
format Online
Article
Text
id pubmed-9049312
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9049312 2022-04-29 A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS Wang, Meida Zhang, Shuanglin Sha, Qiuying PLoS One Research Article Public Library of Science 2022-04-28 /pmc/articles/PMC9049312/ /pubmed/35482827 http://dx.doi.org/10.1371/journal.pone.0260911 Text en © 2022 Wang et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049312/
https://www.ncbi.nlm.nih.gov/pubmed/35482827
http://dx.doi.org/10.1371/journal.pone.0260911