Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis

Bibliographic Details
Main authors: Ueki, Masao, Tamiya, Gen
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531267/
https://www.ncbi.nlm.nih.gov/pubmed/22554139
http://dx.doi.org/10.1186/1471-2105-13-72
author Ueki, Masao
Tamiya, Gen
collection PubMed
description BACKGROUND: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. However, individual hypothesis testing for SNP-SNP pairs, as in a common genome-wide association study (GWAS), makes it difficult to set an overall p-value because of the complicated correlation structure; this is the multiple testing problem, which causes unacceptable false negative results. The number of SNP-SNP pairs is larger than the sample size, the so-called large p, small n problem, which precludes simultaneous analysis using multiple regression. A method that overcomes these issues is therefore needed. RESULTS: We adopt a recent method for ultrahigh-dimensional variable selection, termed sure independence screening (SIS), to handle the large number of SNP-SNP interactions by including them as predictor variables in logistic regression. We propose a ranking strategy that uses promising dummy coding methods, followed by a variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in an application to real WTCCC (Wellcome Trust Case–control Consortium) data. CONCLUSIONS: Based on the machine-learning principle, the proposed method provides a powerful and flexible genome-wide search for various patterns of gene-gene interaction.
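The screening idea in the abstract can be illustrated with a minimal sketch. It is not the EPISIS implementation: it assumes additive 0/1/2 genotype coding with a simple product term for each SNP-SNP pair (the abstract does not specify its dummy coding methods), and it swaps in absolute marginal correlation as the ranking statistic in place of a marginal logistic-regression fit. The function name `screen_interactions`, its parameters, and the default cutoff of n / log(n) retained pairs are assumptions made for illustration.

```python
import itertools

import numpy as np


def screen_interactions(genotypes, phenotype, keep=None):
    """Rank SNP-SNP product terms by absolute marginal correlation.

    genotypes: (n, p) array of 0/1/2 allele counts (assumed coding).
    phenotype: length-n array of 0/1 case-control labels.
    keep: number of top pairs to retain; defaults to int(n / log(n)),
          an assumed cutoff, not one stated in the abstract.
    Returns a list of (snp_j, snp_k, score), best score first.
    """
    n, p = genotypes.shape
    if keep is None:
        keep = int(n / np.log(n))

    # Centre the phenotype once; every pair is scored against it.
    y = phenotype - phenotype.mean()
    y_norm = np.linalg.norm(y)

    scores = []
    for j, k in itertools.combinations(range(p), 2):
        # Product coding of the interaction (illustrative assumption).
        z = genotypes[:, j] * genotypes[:, k]
        z = z - z.mean()
        z_norm = np.linalg.norm(z)
        if z_norm == 0 or y_norm == 0:
            score = 0.0  # constant column: no marginal signal
        else:
            score = float(abs(z @ y) / (z_norm * y_norm))
        scores.append((score, j, k))

    # Each pair is scored on its own, so p*(p-1)/2 may exceed n freely.
    scores.sort(reverse=True)
    return [(j, k, s) for s, j, k in scores[:keep]]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.binomial(2, 0.3, size=(400, 20))
    # Synthetic phenotype driven only by the pair (3, 7).
    y = (g[:, 3] * g[:, 7] > 0).astype(float)
    for j, k, s in screen_interactions(g, y)[:3]:
        print(j, k, round(s, 3))
```

Because each pair is scored independently, the loop is the part that parallelises naturally; the retained pairs would then be passed to a joint variable selection step, which this sketch leaves out.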
format Online
Article
Text
id pubmed-3531267
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3531267 2013-01-03 Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis Ueki, Masao Tamiya, Gen BMC Bioinformatics Methodology Article
BioMed Central 2012-05-03 /pmc/articles/PMC3531267/ /pubmed/22554139 http://dx.doi.org/10.1186/1471-2105-13-72 Text en Copyright ©2012 Ueki and Tamiya; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis
topic Methodology Article