Identifying main effects and epistatic interactions from large-scale SNP data via adaptive group Lasso
BACKGROUND: Single nucleotide polymorphism (SNP) based association studies aim at identifying SNPs associated with phenotypes, for example, complex diseases. The associated SNPs may influence the disease risk individually (main effects) or behave jointly (epistatic interactions). For the analysis of...
Main authors: | Yang, Can; Wan, Xiang; Yang, Qiang; Xue, Hong; Yu, Weichuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2010 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203332/ https://www.ncbi.nlm.nih.gov/pubmed/20122189 http://dx.doi.org/10.1186/1471-2105-11-S1-S18 |
_version_ | 1782215107488514048 |
---|---|
author | Yang, Can Wan, Xiang Yang, Qiang Xue, Hong Yu, Weichuan |
author_facet | Yang, Can Wan, Xiang Yang, Qiang Xue, Hong Yu, Weichuan |
author_sort | Yang, Can |
collection | PubMed |
description | BACKGROUND: Single nucleotide polymorphism (SNP) based association studies aim at identifying SNPs associated with phenotypes, for example, complex diseases. The associated SNPs may influence the disease risk individually (main effects) or behave jointly (epistatic interactions). For the analysis of high-throughput data, the main difficulty is that the number of SNPs far exceeds the number of samples. This difficulty is amplified when identifying interactions. RESULTS: In this paper, we propose an Adaptive Group Lasso (AGL) model for large-scale association studies. Our model enables us to analyze SNPs and their interactions simultaneously. We achieve this by introducing a sparsity constraint in our model based on the fact that only a small fraction of SNPs is disease-associated. In order to reduce the number of false positive findings, we develop an adaptive reweighting scheme to enhance sparsity. In addition, our method treats SNPs and their interactions as factors, and identifies them in a grouped manner. Thus, it is flexible enough to analyze various disease models, especially for interaction detection. However, because fitting the model is computationally intensive when millions of interaction terms need to be searched, our method needs to be combined with some filtering methods when applied to genome-wide data for detecting interactions. CONCLUSION: By using a wide range of simulated datasets and a real dataset from WTCCC, we demonstrate the advantages of our method. |
format | Online Article Text |
id | pubmed-3203332 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3203332 2011-10-29 Identifying main effects and epistatic interactions from large-scale SNP data via adaptive group Lasso Yang, Can Wan, Xiang Yang, Qiang Xue, Hong Yu, Weichuan BMC Bioinformatics Research BACKGROUND: Single nucleotide polymorphism (SNP) based association studies aim at identifying SNPs associated with phenotypes, for example, complex diseases. The associated SNPs may influence the disease risk individually (main effects) or behave jointly (epistatic interactions). For the analysis of high-throughput data, the main difficulty is that the number of SNPs far exceeds the number of samples. This difficulty is amplified when identifying interactions. RESULTS: In this paper, we propose an Adaptive Group Lasso (AGL) model for large-scale association studies. Our model enables us to analyze SNPs and their interactions simultaneously. We achieve this by introducing a sparsity constraint in our model based on the fact that only a small fraction of SNPs is disease-associated. In order to reduce the number of false positive findings, we develop an adaptive reweighting scheme to enhance sparsity. In addition, our method treats SNPs and their interactions as factors, and identifies them in a grouped manner. Thus, it is flexible enough to analyze various disease models, especially for interaction detection. However, because fitting the model is computationally intensive when millions of interaction terms need to be searched, our method needs to be combined with some filtering methods when applied to genome-wide data for detecting interactions. CONCLUSION: By using a wide range of simulated datasets and a real dataset from WTCCC, we demonstrate the advantages of our method. BioMed Central 2010-01-18 /pmc/articles/PMC3203332/ /pubmed/20122189 http://dx.doi.org/10.1186/1471-2105-11-S1-S18 Text en Copyright ©2010 Yang et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Yang, Can Wan, Xiang Yang, Qiang Xue, Hong Yu, Weichuan Identifying main effects and epistatic interactions from large-scale SNP data via adaptive group Lasso |
title | Identifying main effects and epistatic interactions from large-scale SNP data via adaptive group Lasso |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203332/ https://www.ncbi.nlm.nih.gov/pubmed/20122189 http://dx.doi.org/10.1186/1471-2105-11-S1-S18 |
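
The description above names three ingredients: a sparsity constraint, grouped selection of SNPs and interactions treated as factors, and adaptive reweighting. A minimal sketch of how they can fit together follows. It is not the authors' implementation: it assumes a squared-error loss (the record does not state the loss used in the paper), a plain proximal-gradient solver, and hypothetical names (`group_lasso`, `adaptive_group_lasso`, `groups`, `lam`, `eps`). Each entry of `groups` is assumed to list the design-matrix columns that encode one factor, for example the dummy columns of one SNP or of one SNP pair.

```python
import numpy as np


def group_soft_threshold(v, t):
    """Shrink a whole group vector toward zero by t in Euclidean norm."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)  # the entire group is dropped together
    return (1.0 - t / norm) * v


def group_lasso(X, y, groups, lam, weights, n_iter=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + lam * sum_g w_g ||b_g||_2.

    Assumption: squared-error loss, chosen here only for brevity.
    """
    n, p = X.shape
    beta = np.zeros(p)
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        for g, idx in enumerate(groups):
            beta[idx] = group_soft_threshold(z[idx], step * lam * weights[g])
    return beta


def adaptive_group_lasso(X, y, groups, lam, n_rounds=3, eps=1e-3):
    """Refit with per-group weights 1 / (||b_g|| + eps) from the previous fit.

    Groups that were weak in the previous round are penalised more heavily,
    which is one common way to sharpen sparsity; eps is a hypothetical
    safeguard against division by zero.
    """
    weights = np.ones(len(groups))
    for _ in range(n_rounds):
        beta = group_lasso(X, y, groups, lam, weights)
        norms = np.array([np.linalg.norm(beta[idx]) for idx in groups])
        weights = 1.0 / (norms + eps)
    return beta
```

Because the penalty acts on the Euclidean norm of each group, all columns of a factor enter or leave the model together, which is what allows a SNP or an interaction term to be selected as a unit regardless of how its effect is spread across genotype levels.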