Using tree-based methods for detection of gene–gene interactions in the presence of a polygenic signal: simulation study with application to educational attainment in the Generation Scotland Cohort Study

Bibliographic Details
Main authors: Meijsen, Joeri J; Rammos, Alexandros; Campbell, Archie; Hayward, Caroline; Porteous, David J; Deary, Ian J; Marioni, Riccardo E; Nicodemus, Kristin K
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6330004/
https://www.ncbi.nlm.nih.gov/pubmed/29931044
http://dx.doi.org/10.1093/bioinformatics/bty462
_version_ 1783386909761863680
author Meijsen, Joeri J
Rammos, Alexandros
Campbell, Archie
Hayward, Caroline
Porteous, David J
Deary, Ian J
Marioni, Riccardo E
Nicodemus, Kristin K
collection PubMed
description MOTIVATION: The genomic architecture of human complex diseases is thought to be attributable to single markers, polygenic components and epistatic components. No study has examined the ability of tree-based methods to detect epistasis in the presence of a polygenic signal. We sought to apply decision tree-based methods, C5.0 and logic regression, to detect epistasis under several simulated conditions, varying strength of interaction and linkage disequilibrium (LD) structure. We then applied the same methods to the phenotype of educational attainment in a large population cohort. RESULTS: LD pruning improved the power and reduced the type I error. C5.0 had a conservative type I error rate whereas logic regression had a type I error rate that exceeded 5%. Despite the more conservative type I error, C5.0 was observed to have higher power than logic regression across several conditions. In the presence of a polygenic signal, power was generally reduced. Applying both methods on educational attainment in a large population cohort yielded numerous interacting SNPs; notably a SNP in RCAN3 which is associated with reading and spelling and a SNP in NPAS3, a neurodevelopmental gene. AVAILABILITY AND IMPLEMENTATION: All methods used are implemented and freely available in R. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6330004
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6330004 2019-01-15 Bioinformatics Original Papers Oxford University Press 2019-01-15 2018-06-19 /pmc/articles/PMC6330004/ /pubmed/29931044 http://dx.doi.org/10.1093/bioinformatics/bty462 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Using tree-based methods for detection of gene–gene interactions in the presence of a polygenic signal: simulation study with application to educational attainment in the Generation Scotland Cohort Study
topic Original Papers