A genetic ensemble approach for gene-gene interaction identification

BACKGROUND: It has now become clear that gene-gene interactions and gene-environment interactions are ubiquitous and fundamental mechanisms for the development of complex diseases. Though a considerable effort has been put into developing statistical models and algorithmic strategies for identifying...

Bibliographic Details
Main authors: Yang, Pengyi, Ho, Joshua WK, Zomaya, Albert Y, Zhou, Bing B
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2973963/
https://www.ncbi.nlm.nih.gov/pubmed/20961462
http://dx.doi.org/10.1186/1471-2105-11-524
_version_ 1782190855019298816
author Yang, Pengyi
Ho, Joshua WK
Zomaya, Albert Y
Zhou, Bing B
author_facet Yang, Pengyi
Ho, Joshua WK
Zomaya, Albert Y
Zhou, Bing B
author_sort Yang, Pengyi
collection PubMed
description BACKGROUND: It has now become clear that gene-gene interactions and gene-environment interactions are ubiquitous and fundamental mechanisms for the development of complex diseases. Though considerable effort has been put into developing statistical models and algorithmic strategies for identifying such interactions, the accurate identification of those genetic interactions has proven to be very challenging. METHODS: In this paper, we propose a new approach for identifying such gene-gene and gene-environment interactions underlying complex diseases. It is a hybrid algorithm, called genetic ensemble, that combines a genetic algorithm (GA) with an ensemble of classifiers. Using this approach, the original problem of single nucleotide polymorphism (SNP) interaction identification is converted into a data mining problem of combinatorial feature selection. By collecting the various SNP subsets and environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. Also considered in this study is the idea of combining identification results obtained from multiple algorithms. A novel formula based on pairwise double fault is designed to quantify the degree of complementarity. CONCLUSIONS: Our simulation study demonstrates that the proposed genetic ensemble algorithm has identification power comparable to that of Multifactor Dimensionality Reduction (MDR) and slightly better than that of Polymorphism Interaction Analysis (PIA), which are the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR.
Experimental results from our simulation studies and real world data application also confirm the effectiveness of the proposed genetic ensemble algorithm, as well as the potential benefits of combining identification results from different algorithms.
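The two ideas the abstract names, ranking feature combinations collected over multiple GA runs and measuring complementarity through pairwise double fault, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the ranking rule nor their novel formula, so the sketch assumes a plain co-occurrence count for the ranking and uses the standard double-fault measure (the fraction of cases that two methods both get wrong). The function names and the toy inputs are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rank_combinations(selected_subsets, order=2):
    """Assumed ranking rule: count how often each size-`order` feature
    combination co-occurs in the subsets returned by separate GA runs,
    and list the combinations from most to least frequent."""
    counts = Counter()
    for subset in selected_subsets:
        # Sorting makes ("snp3", "snp7") and ("snp7", "snp3") the same key.
        for combo in combinations(sorted(set(subset)), order):
            counts[combo] += 1
    return counts.most_common()


def double_fault(truth, pred_a, pred_b):
    """Standard pairwise double-fault measure: the fraction of cases that
    both methods get wrong. This is the textbook quantity, not the paper's
    formula. Lower values suggest the two methods complement each other."""
    if not (len(truth) == len(pred_a) == len(pred_b)):
        raise ValueError("inputs must have equal length")
    if len(truth) == 0:
        raise ValueError("inputs must be non-empty")
    both_wrong = sum(
        1 for t, a, b in zip(truth, pred_a, pred_b) if a != t and b != t
    )
    return both_wrong / len(truth)


# Hypothetical feature subsets selected in three GA runs.
runs = [["snp3", "snp7", "age"], ["snp7", "snp3"], ["snp1", "snp7"]]
top_combo, top_count = rank_combinations(runs)[0]

# Hypothetical true labels and the calls made by two methods.
fault = double_fault([1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 1])
```

In this toy input the pair ("snp3", "snp7") is selected together in two of the three runs, so it ranks first, and the two methods are both wrong on one of four cases.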
format Text
id pubmed-2973963
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-29739632010-11-05 A genetic ensemble approach for gene-gene interaction identification Yang, Pengyi Ho, Joshua WK Zomaya, Albert Y Zhou, Bing B BMC Bioinformatics Methodology Article BioMed Central 2010-10-21 /pmc/articles/PMC2973963/ /pubmed/20961462 http://dx.doi.org/10.1186/1471-2105-11-524 Text en Copyright ©2010 Yang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Yang, Pengyi
Ho, Joshua WK
Zomaya, Albert Y
Zhou, Bing B
A genetic ensemble approach for gene-gene interaction identification
title A genetic ensemble approach for gene-gene interaction identification
title_full A genetic ensemble approach for gene-gene interaction identification
title_fullStr A genetic ensemble approach for gene-gene interaction identification
title_full_unstemmed A genetic ensemble approach for gene-gene interaction identification
title_short A genetic ensemble approach for gene-gene interaction identification
title_sort genetic ensemble approach for gene-gene interaction identification
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2973963/
https://www.ncbi.nlm.nih.gov/pubmed/20961462
http://dx.doi.org/10.1186/1471-2105-11-524
work_keys_str_mv AT yangpengyi ageneticensembleapproachforgenegeneinteractionidentification
AT hojoshuawk ageneticensembleapproachforgenegeneinteractionidentification
AT zomayaalberty ageneticensembleapproachforgenegeneinteractionidentification
AT zhoubingb ageneticensembleapproachforgenegeneinteractionidentification
AT yangpengyi geneticensembleapproachforgenegeneinteractionidentification
AT hojoshuawk geneticensembleapproachforgenegeneinteractionidentification
AT zomayaalberty geneticensembleapproachforgenegeneinteractionidentification
AT zhoubingb geneticensembleapproachforgenegeneinteractionidentification