Prioritizing tests of epistasis through hierarchical representation of genomic redundancies

Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors,...

Bibliographic Details
Main Authors:	Cowman, Tyler; Koyutürk, Mehmet
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2017
Subjects:	Methods Online
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737499/ https://www.ncbi.nlm.nih.gov/pubmed/28605458 http://dx.doi.org/10.1093/nar/gkx505

_version_	1783287529370288128
author	Cowman, Tyler Koyutürk, Mehmet
author_facet	Cowman, Tyler Koyutürk, Mehmet
author_sort	Cowman, Tyler
collection	PubMed
description	Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors, with applications including assessment of susceptibility to complex diseases, decision making in precision medicine, and gaining insights into disease mechanisms. Since the number of genomic loci assayed by GWAS is extremely large (usually in the order of millions), identification of epistatic loci is a statistically difficult and computationally intensive problem. Even when only pairwise interactions are considered, the size of the search space ranges from hundreds of millions to billions of locus pairs. The large number of statistical tests performed also makes sufficient type one error correction imperative. Consequently, efficient algorithms are required to filter the tests that are performed and evaluate large GWAS data sets in a reasonable amount of computation time. It has been observed that many pairwise tests are redundant due to correlations in their genotype values across samples, known as linkage disequilibrium. However, algorithms that have been developed for efficient identification of epistatic loci do not systematically exploit linkage disequilibrium. Here, we propose a new algorithm for fast epistasis detection based on hierarchical representation of linkage disequilibrium (LinDen). We utilize redundancies in genotype patterns between neighboring loci to generate a hierarchical structure and execute a branch-and-bound search to prioritize loci testing based on approximations of a test statistic for pairs of locus groups. The hierarchical organization of tests performed by LinDen allows for efficient scaling based on the screened loci. 
We test LinDen comprehensively on three data sets obtained from the Wellcome Trust Case Control Consortium: type two diabetes, psoriasis, and hypertension. Our results show that, as compared to other state-of-the-art tools for fast epistasis detection, LinDen drastically reduces the number of tests performed while discovering statistically significant locus pairs. LinDen is implemented in C++ and is available as open source at http://compbio.case.edu/linden/.
format	Online Article Text
id	pubmed-5737499
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-57374992018-01-09 Prioritizing tests of epistasis through hierarchical representation of genomic redundancies Cowman, Tyler Koyutürk, Mehmet Nucleic Acids Res Methods Online Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors, with applications including assessment of susceptibility to complex diseases, decision making in precision medicine, and gaining insights into disease mechanisms. Since the number of genomic loci assayed by GWAS is extremely large (usually in the order of millions), identification of epistatic loci is a statistically difficult and computationally intensive problem. Even when only pairwise interactions are considered, the size of the search space ranges from hundreds of millions to billions of locus pairs. The large number of statistical tests performed also makes sufficient type one error correction imperative. Consequently, efficient algorithms are required to filter the tests that are performed and evaluate large GWAS data sets in a reasonable amount of computation time. It has been observed that many pairwise tests are redundant due to correlations in their genotype values across samples, known as linkage disequilibrium. However, algorithms that have been developed for efficient identification of epistatic loci do not systematically exploit linkage disequilibrium. Here, we propose a new algorithm for fast epistasis detection based on hierarchical representation of linkage disequilibrium (LinDen). We utilize redundancies in genotype patterns between neighboring loci to generate a hierarchical structure and execute a branch-and-bound search to prioritize loci testing based on approximations of a test statistic for pairs of locus groups. 
The hierarchical organization of tests performed by LinDen allows for efficient scaling based on the screened loci. We test LinDen comprehensively on three data sets obtained from the Wellcome Trust Case Control Consortium: type two diabetes, psoriasis, and hypertension. Our results show that, as compared to other state-of-the-art tools for fast epistasis detection, LinDen drastically reduces the number of tests performed while discovering statistically significant locus pairs. LinDen is implemented in C++ and is available as open source at http://compbio.case.edu/linden/. Oxford University Press 2017-08-21 2017-06-09 /pmc/articles/PMC5737499/ /pubmed/28605458 http://dx.doi.org/10.1093/nar/gkx505 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle	Methods Online Cowman, Tyler Koyutürk, Mehmet Prioritizing tests of epistasis through hierarchical representation of genomic redundancies
title	Prioritizing tests of epistasis through hierarchical representation of genomic redundancies
title_full	Prioritizing tests of epistasis through hierarchical representation of genomic redundancies
title_fullStr	Prioritizing tests of epistasis through hierarchical representation of genomic redundancies
title_full_unstemmed	Prioritizing tests of epistasis through hierarchical representation of genomic redundancies
title_short	Prioritizing tests of epistasis through hierarchical representation of genomic redundancies
title_sort	prioritizing tests of epistasis through hierarchical representation of genomic redundancies
topic	Methods Online
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737499/ https://www.ncbi.nlm.nih.gov/pubmed/28605458 http://dx.doi.org/10.1093/nar/gkx505
work_keys_str_mv	AT cowmantyler prioritizingtestsofepistasisthroughhierarchicalrepresentationofgenomicredundancies AT koyuturkmehmet prioritizingtestsofepistasisthroughhierarchicalrepresentationofgenomicredundancies

Similar Items