Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis
BACKGROUND: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. Individual hypothesis testing for SNP-SNP pairs as in common genome-wide associatio...
Main Authors: | Ueki, Masao, Tamiya, Gen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531267/ https://www.ncbi.nlm.nih.gov/pubmed/22554139 http://dx.doi.org/10.1186/1471-2105-13-72 |
_version_ | 1782254144681148416 |
---|---|
author | Ueki, Masao Tamiya, Gen |
author_facet | Ueki, Masao Tamiya, Gen |
author_sort | Ueki, Masao |
collection | PubMed |
description | BACKGROUND: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. However, individual hypothesis testing of SNP-SNP pairs, as in a common genome-wide association study (GWAS), makes it difficult to set an overall p-value because of the complicated correlation structure; this is the multiple testing problem, which causes unacceptable false negative results. The number of SNP-SNP pairs is much larger than the sample size (the so-called large p, small n problem), which precludes simultaneous analysis using multiple regression. A method that overcomes these issues is therefore needed. RESULTS: We adopt a recent method for ultrahigh-dimensional variable selection, termed sure independence screening (SIS), to handle the very large number of SNP-SNP interactions appropriately by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods, followed by a variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in an application to real WTCCC (Wellcome Trust Case–control Consortium) data. CONCLUSIONS: Based on the machine-learning principle, the proposed method provides a powerful and flexible genome-wide search for various patterns of gene-gene interaction. |
format | Online Article Text |
id | pubmed-3531267 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3531267 2013-01-03 Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis Ueki, Masao Tamiya, Gen BMC Bioinformatics Methodology Article BioMed Central 2012-05-03 /pmc/articles/PMC3531267/ /pubmed/22554139 http://dx.doi.org/10.1186/1471-2105-13-72 Text en Copyright ©2012 Ueki and Tamiya; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Ueki, Masao Tamiya, Gen Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
title | Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
title_full | Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
title_fullStr | Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
title_full_unstemmed | Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
title_short | Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
title_sort | ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531267/ https://www.ncbi.nlm.nih.gov/pubmed/22554139 http://dx.doi.org/10.1186/1471-2105-13-72 |
work_keys_str_mv | AT uekimasao ultrahighdimensionalvariableselectionmethodforwholegenomegenegeneinteractionanalysis AT tamiyagen ultrahighdimensionalvariableselectionmethodforwholegenomegenegeneinteractionanalysis |
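
The screening idea summarized in the description field can be illustrated with a small sketch. It is not the EPISIS implementation and makes several simplifying assumptions: each SNP-SNP interaction is coded as the product of minor-allele counts (the article proposes its own dummy coding methods, which are not given in this record), and pairs are ranked by absolute marginal correlation with case status rather than by a marginal logistic-regression fit. The function names `pearson` and `screen_pairs`, the toy genotypes, and the figure of 500,000 SNPs are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_pairs(genotypes, status, keep):
    """Rank every SNP-SNP pair by |marginal correlation| and return the top `keep` pairs.

    genotypes: one list per SNP holding minor-allele counts (0, 1, 2) per sample
    status:    0/1 case-control labels, one per sample
    """
    scored = []
    for i, j in combinations(range(len(genotypes)), 2):
        # Assumption: product coding of the interaction, not the article's dummy coding.
        interaction = [a * b for a, b in zip(genotypes[i], genotypes[j])]
        scored.append((abs(pearson(interaction, status)), (i, j)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:keep]]


# Toy data: 4 SNPs, 8 samples; cases are exactly the samples where SNP 0 and SNP 2
# both carry at least one minor allele.
genotypes = [
    [0, 1, 2, 0, 1, 2, 0, 1],
    [1, 1, 0, 2, 2, 0, 1, 0],
    [0, 0, 1, 1, 2, 2, 0, 1],
    [1, 2, 0, 2, 1, 0, 0, 1],
]
status = [0, 0, 1, 0, 1, 1, 0, 1]

top_pairs = screen_pairs(genotypes, status, keep=2)
print(top_pairs)

# Size of the search space for a hypothetical panel of 500,000 SNPs.
n_pairs = comb(500_000, 2)
print(n_pairs)
```

In a full sure independence screening workflow, the retained pairs would then be passed to a joint variable selection step; the sketch stops at the ranking stage. The pair count printed at the end shows why the number of candidate predictors dwarfs any realistic sample size.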