Association Test Based on SNP Set: Logistic Kernel Machine Based Test vs. Principal Component Analysis

Bibliographic Details
Main Authors: Zhao, Yang; Chen, Feng; Zhai, Rihong; Lin, Xihong; Diao, Nancy; Christiani, David C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3441747/
https://www.ncbi.nlm.nih.gov/pubmed/23028716
http://dx.doi.org/10.1371/journal.pone.0044978
_version_ 1782243365528535040
author Zhao, Yang
Chen, Feng
Zhai, Rihong
Lin, Xihong
Diao, Nancy
Christiani, David C.
author_sort Zhao, Yang
collection PubMed
description GWAS has greatly facilitated the discovery of risk SNPs associated with complex diseases. Traditional methods analyze SNPs individually and are limited by low power and poor reproducibility because correction for multiple comparisons is required. Several methods have been proposed that group SNPs into SNP sets using biological knowledge and/or genomic features. In this article, we compare the linear kernel machine based test (LKM) and the principal components analysis based approach (PCA) using datasets simulated under scenarios of 0 to 3 causal SNPs and under simple and complex linkage disequilibrium (LD) structures of the simulated regions. Our simulation study demonstrates that both LKM and PCA control the type I error at the 0.05 significance level. If the causal SNP is in strong LD with the genotyped SNPs, both PCA with a small number of principal components (PCs) and LKM with a linear or identity-by-state (IBS) kernel are valid tests. However, if the LD structure is complex, for example when the SNP set contains several LD blocks or when the causal SNP lies outside the LD block in which most of the genotyped SNPs reside, more PCs should be included to capture the information of the causal SNP. The simulation studies also demonstrate that LKM and PCA can combine information from multiple causal SNPs and provide increased power over individual SNP analysis. We also apply LKM and PCA to two SNP sets extracted from an actual GWAS dataset on non-small cell lung cancer.
format Online
Article
Text
id pubmed-3441747
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3441747 2012-10-01 Association Test Based on SNP Set: Logistic Kernel Machine Based Test vs. Principal Component Analysis Zhao, Yang Chen, Feng Zhai, Rihong Lin, Xihong Diao, Nancy Christiani, David C. PLoS One Research Article
Public Library of Science 2012-09-13 /pmc/articles/PMC3441747/ /pubmed/23028716 http://dx.doi.org/10.1371/journal.pone.0044978 Text en © 2012 Zhao et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Association Test Based on SNP Set: Logistic Kernel Machine Based Test vs. Principal Component Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3441747/
https://www.ncbi.nlm.nih.gov/pubmed/23028716
http://dx.doi.org/10.1371/journal.pone.0044978