
GWIS - model-free, fast and exhaustive search for epistatic interactions in case-control GWAS

BACKGROUND: It has been hypothesized that multivariate analysis and systematic detection of epistatic interactions between explanatory genotyping variables may help resolve the problem of "missing heritability" currently observed in genome-wide association studies (GWAS). However, even the...


Bibliographic Details
Main authors: Goudey, Benjamin, Rawlinson, David, Wang, Qiao, Shi, Fan, Ferra, Herman, Campbell, Richard M, Stern, Linda, Inouye, Michael T, Ong, Cheng Soon, Kowalczyk, Adam
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665501/
https://www.ncbi.nlm.nih.gov/pubmed/23819779
http://dx.doi.org/10.1186/1471-2164-14-S3-S10
author Goudey, Benjamin
Rawlinson, David
Wang, Qiao
Shi, Fan
Ferra, Herman
Campbell, Richard M
Stern, Linda
Inouye, Michael T
Ong, Cheng Soon
Kowalczyk, Adam
collection PubMed
description BACKGROUND: It has been hypothesized that multivariate analysis and systematic detection of epistatic interactions between explanatory genotyping variables may help resolve the problem of "missing heritability" currently observed in genome-wide association studies (GWAS). However, even the simplest bivariate analysis is still held back by significant statistical and computational challenges that are often addressed by reducing the set of analysed markers. Theoretically, it has been shown that combinations of loci may exist that show weak or no effects individually, but show significant (even complete) explanatory power over phenotype when combined. Reducing the set of analysed SNPs before bivariate analysis could easily omit such critical loci.
RESULTS: We have developed an exhaustive bivariate GWAS analysis methodology that yields a manageable subset of candidate marker pairs for subsequent analysis using other, often more computationally expensive techniques. Our model-free filtering approach is based on classification using ROC curve analysis, an alternative to much slower regression-based modelling techniques. Exhaustive analysis of studies containing approximately 450,000 SNPs and 5,000 samples requires only 2 hours using a desktop CPU or 13 minutes using a GPU (Graphics Processing Unit). We validate our methodology with analysis of simulated datasets as well as the seven Wellcome Trust Case-Control Consortium datasets that represent a wide range of real life GWAS challenges. We have identified SNP pairs that have considerably stronger association with disease than their individual component SNPs that often show negligible effect univariately. When compared against previously reported results in the literature, our methods re-detect most significant SNP-pairs and additionally detect many pairs absent from the literature that show strong association with disease. The high overlap suggests that our fast analysis could substitute for some slower alternatives.
CONCLUSIONS: We demonstrate that the proposed methodology is robust, fast and capable of exhaustive search for epistatic interactions using a standard desktop computer. First, our implementation is significantly faster than timings for comparable algorithms reported in the literature, especially as our method allows simultaneous use of multiple statistical filters with low computing time overhead. Second, for some diseases, we have identified hundreds of SNP pairs that pass formal multiple test (Bonferroni) correction and could form a rich source of hypotheses for follow-up analysis.
AVAILABILITY: A web-based version of the software used for this analysis is available at http://bioinformatics.research.nicta.com.au/gwis.
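The abstract describes an exhaustive scan of all SNP pairs, scored with ROC curve analysis and followed by Bonferroni correction. The Python sketch below illustrates what those three ideas can look like in code. It is not the GWIS implementation: the function names, the 0/1/2 genotype coding, the 0.05 significance level, and the choice of AUC as the summary statistic are all assumptions made for illustration, and the AUC here is computed in-sample, so it is optimistic.

```python
from itertools import product
from math import comb


def pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs visited by an exhaustive bivariate scan."""
    return comb(n_snps, 2)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level under Bonferroni correction.

    alpha = 0.05 is an assumed default, not a value taken from the abstract.
    """
    return alpha / n_tests


def pair_auc(geno_a, geno_b, is_case) -> float:
    """Area under an empirical ROC curve for one SNP pair (illustrative only).

    Genotypes are assumed to be coded 0/1/2. Each of the 9 joint genotypes
    is treated as a cell; cells are ranked by case fraction, and the ROC
    curve is traced by adding cells in that order.
    """
    cases = {cell: 0 for cell in product(range(3), repeat=2)}
    controls = dict(cases)
    for a, b, y in zip(geno_a, geno_b, is_case):
        (cases if y else controls)[(a, b)] += 1

    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    # Rank non-empty cells from most to least case-enriched.
    cells = [c for c in cases if cases[c] + controls[c] > 0]
    cells.sort(key=lambda c: cases[c] / (cases[c] + controls[c]), reverse=True)

    auc = 0.0
    tpr = 0.0
    for c in cells:
        d_tpr = cases[c] / n_cases        # step in sensitivity
        d_fpr = controls[c] / n_controls  # step in 1 - specificity
        auc += d_fpr * (tpr + d_tpr / 2)  # trapezoid under this ROC segment
        tpr += d_tpr
    return auc


if __name__ == "__main__":
    n_pairs = pair_count(450_000)
    print(n_pairs, bonferroni_threshold(n_pairs))

    # Toy XOR-style pattern: neither SNP separates cases alone, the pair does.
    snp_a = [0, 0, 1, 1]
    snp_b = [0, 1, 0, 1]
    status = [0, 1, 1, 0]
    print(pair_auc(snp_a, [0] * 4, status), pair_auc(snp_a, snp_b, status))
```

For a study of 450,000 SNPs, `pair_count` gives 101,249,775,000 pairs, so a Bonferroni threshold at an assumed 0.05 level is roughly 4.9e-13. In the toy data, the single-SNP score is 0.5 while the pair scores 1.0, a small instance of the case the abstract mentions where loci show no effect individually but full explanatory power when combined.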
format Online
Article
Text
id pubmed-3665501
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3665501 2013-06-05 GWIS - model-free, fast and exhaustive search for epistatic interactions in case-control GWAS. BMC Genomics. Research. BioMed Central 2013-05-28 /pmc/articles/PMC3665501/ /pubmed/23819779 http://dx.doi.org/10.1186/1471-2164-14-S3-S10 Text en Copyright © 2013 Goudey et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title GWIS - model-free, fast and exhaustive search for epistatic interactions in case-control GWAS
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665501/
https://www.ncbi.nlm.nih.gov/pubmed/23819779
http://dx.doi.org/10.1186/1471-2164-14-S3-S10