High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies

Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analys...

Bibliographic Details
Main Authors: Goudey, Benjamin, Abedini, Mani, Hopper, John L, Inouye, Michael, Makalic, Enes, Schmidt, Daniel F, Wagner, John, Zhou, Zeyu, Zobel, Justin, Reumann, Matthias
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383059/
https://www.ncbi.nlm.nih.gov/pubmed/25870758
http://dx.doi.org/10.1186/2047-2501-3-S1-S3
author Goudey, Benjamin
Abedini, Mani
Hopper, John L
Inouye, Michael
Makalic, Enes
Schmidt, Daniel F
Wagner, John
Zhou, Zeyu
Zobel, Justin
Reumann, Matthias
collection PubMed
description Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
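The runtime figures in the description can be sanity-checked with a short calculation. The sketch below, in Python, counts the SNP triples an exhaustive three-way analysis must visit and rescales the quoted Avoca estimate to Sequoia under the linear-scaling assumption the abstract states. The core counts (65,536 for Avoca, 1,572,864 for Sequoia) are not given in this record; they are assumed from the published specifications of the two machines, and the function names are illustrative only.

```python
from math import comb

# Figures quoted in the abstract.
N_SNPS = 1_100_000     # SNPs in the GWAS data set
AVOCA_YEARS = 5.8      # estimated runtime on the full Avoca installation

# ASSUMPTION: core counts are not stated in this record.
AVOCA_CORES = 65_536
SEQUOIA_CORES = 1_572_864


def triple_count(n_snps: int) -> int:
    """Number of unordered SNP triples tested by an exhaustive 3-way scan."""
    return comb(n_snps, 3)


def scaled_runtime_months(base_years: float, base_cores: int, target_cores: int) -> float:
    """Rescale a runtime to a larger machine, assuming perfectly linear scaling."""
    return base_years * 12 * base_cores / target_cores


triples = triple_count(N_SNPS)
months = scaled_runtime_months(AVOCA_YEARS, AVOCA_CORES, SEQUOIA_CORES)

print(f"{triples:.3e} SNP triples to test")
print(f"about {months:.1f} months on Sequoia under linear scaling")
```

Under these assumed core counts Sequoia has 24 times as many cores as Avoca, so the rescaled figure is in line with the "under 3 months" estimate given in the description.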
format Online
Article
Text
id pubmed-4383059
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4383059 2015-04-13 Health Inf Sci Syst Research BioMed Central 2015-02-24 /pmc/articles/PMC4383059/ /pubmed/25870758 http://dx.doi.org/10.1186/2047-2501-3-S1-S3 Text en Copyright © 2015 Goudey et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383059/
https://www.ncbi.nlm.nih.gov/pubmed/25870758
http://dx.doi.org/10.1186/2047-2501-3-S1-S3