
Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data)

The genetic architecture of complex human traits and diseases is affected by a large number of possibly interacting genes, but detecting epistatic interactions can be challenging. In the last decade, several studies have alluded to problems that linkage disequilibrium can create when testing for epist...


Bibliographic Details
Main authors: de los Campos, Gustavo; Sorensen, Daniel Alberto; Toro, Miguel Angel
Format: Online Article Text
Language: English
Published: Genetics Society of America 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6505142/
https://www.ncbi.nlm.nih.gov/pubmed/30877081
http://dx.doi.org/10.1534/g3.119.400101
author de los Campos, Gustavo
Sorensen, Daniel Alberto
Toro, Miguel Angel
collection PubMed
description The genetic architecture of complex human traits and diseases is affected by a large number of possibly interacting genes, but detecting epistatic interactions can be challenging. In the last decade, several studies have alluded to problems that linkage disequilibrium (LD) can create when testing for epistatic interactions between DNA markers. However, these problems have not been formalized, nor have their consequences been precisely quantified. Here we use a conceptually simple three-locus model involving a causal locus and two markers to show that imperfect LD can generate the illusion of epistasis, even when the underlying genetic architecture is purely additive. We describe necessary conditions for such “phantom epistasis” to emerge and quantify its relevance using simulations. Our empirical results demonstrate that phantom epistasis can be a very serious problem in GWAS (with rejection rates against the additive model greater than 0.28 for nominal p-values of 0.05, even when the model is purely additive). Some studies have sought to avoid this problem by only testing interactions between SNPs with R² < 0.1. We show that this threshold is not appropriate and demonstrate that the magnitude of the problem is even greater with large sample sizes, with intermediate allele frequencies, and when the causal locus explains a large amount of phenotypic variance. We conclude that caution must be exercised when interpreting GWAS results derived from very large data sets showing strong evidence in support of epistatic interactions between markers.
format Online
Article
Text
id pubmed-6505142
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-6505142 2019-05-21 Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data) de los Campos, Gustavo; Sorensen, Daniel Alberto; Toro, Miguel Angel G3 (Bethesda) Investigations Genetics Society of America 2019-03-21 /pmc/articles/PMC6505142/ /pubmed/30877081 http://dx.doi.org/10.1534/g3.119.400101 Text en Copyright © 2019 de los Campos et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data)
topic Investigations