
GenEpi: gene-based epistasis discovery using machine learning

BACKGROUND: Genome-wide association studies (GWAS) provide a powerful means to identify associations between genetic variants and phenotypes. However, GWAS techniques for detecting epistasis, the interactions between genetic variants associated with phenotypes, are still limited. We believe that dev...


Bibliographic Details
Main authors: Chang, Yu-Chuan, Wu, June-Tai, Hong, Ming-Yi, Tung, Yi-An, Hsieh, Ping-Han, Yee, Sook Wah, Giacomini, Kathleen M., Oyang, Yen-Jen, Chen, Chien-Yu
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7041299/
https://www.ncbi.nlm.nih.gov/pubmed/32093643
http://dx.doi.org/10.1186/s12859-020-3368-2
author Chang, Yu-Chuan
Wu, June-Tai
Hong, Ming-Yi
Tung, Yi-An
Hsieh, Ping-Han
Yee, Sook Wah
Giacomini, Kathleen M.
Oyang, Yen-Jen
Chen, Chien-Yu
collection PubMed
description BACKGROUND: Genome-wide association studies (GWAS) provide a powerful means to identify associations between genetic variants and phenotypes. However, GWAS techniques for detecting epistasis, the interactions between genetic variants associated with phenotypes, are still limited. We believe that developing an efficient and effective GWAS method to detect epistasis will be key to discovering sophisticated pathogenesis, which is especially important for complex diseases such as Alzheimer’s disease (AD). RESULTS: This study presents GenEpi, a computational package that uncovers epistasis associated with phenotypes using the proposed machine learning approach. GenEpi identifies both within-gene and cross-gene epistasis through a two-stage modeling workflow. In both stages, GenEpi adopts two-element combinatorial encoding when producing features and constructs the prediction models by L1-regularized regression with stability selection. On simulated data, GenEpi outperformed other widely used methods at detecting the ground-truth epistasis. On real data, this study uses AD as an example to show the capability of GenEpi to find disease-related variants and variant interactions that have both biological meaning and predictive power. CONCLUSIONS: The results on simulated data and AD demonstrate that GenEpi can detect epistasis associated with phenotypes effectively and efficiently. The released package can be generalized to facilitate studies of many complex diseases.
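The abstract names the techniques (two-element combinatorial encoding, L1-regularized regression, stability selection) without implementation details. The sketch below is only an illustration of how those three pieces might fit together, under several assumptions that are not taken from the record: genotypes are coded 0/1/2, each genotype is one-hot encoded, a two-element feature is the product of two one-hot columns from different SNPs, and a plain proximal-gradient solver stands in for whatever solver GenEpi uses. All function names are hypothetical and are not the GenEpi API.

```python
import itertools
import numpy as np

def one_hot_genotypes(G):
    """One-hot encode genotypes coded 0/1/2 (assumed coding); 3 columns per SNP."""
    cols, names = [], []
    for j in range(G.shape[1]):
        for g in (0, 1, 2):
            cols.append((G[:, j] == g).astype(float))
            names.append(f"snp{j}={g}")
    return np.column_stack(cols), names

def pairwise_features(X, names):
    """Append the product of every pair of one-hot columns from different SNPs."""
    cols, out = [X], list(names)
    for a, b in itertools.combinations(range(X.shape[1]), 2):
        if a // 3 == b // 3:  # same SNP: the product is always zero
            continue
        cols.append((X[:, a] * X[:, b])[:, None])
        out.append(names[a] + "*" + names[b])
    return np.hstack(cols), out

def l1_logistic(X, y, lam, n_iter=500):
    """L1-penalized logistic regression fitted by proximal gradient (ISTA)."""
    n, p = X.shape
    w = np.zeros(p)
    ybar = np.clip(y.mean(), 1e-3, 1 - 1e-3)
    b = np.log(ybar / (1 - ybar))  # start the unpenalized intercept at the base rate
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)  # 1 / (Lipschitz bound)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        resid = prob - y
        w = w - step * (X.T @ resid) / n
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # soft threshold
        b -= step * resid.mean()
    return w, b

def stability_selection(X, y, lam, n_rounds=30, seed=0):
    """Fraction of half-sample refits in which each feature gets a nonzero weight."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    counts = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n // 2, replace=False)
        w, _ = l1_logistic(X[idx], y[idx], lam)
        counts += (w != 0)
    return counts / n_rounds

# Toy data with one planted interaction: case status is snp0 == 2 AND snp1 == 0.
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(400, 4))
y = ((G[:, 0] == 2) & (G[:, 1] == 0)).astype(float)
X, names = pairwise_features(*one_hot_genotypes(G))
freq = stability_selection(X, y, lam=0.05)
for i in np.argsort(-freq)[:5]:
    print(names[i], freq[i])
```

The selection frequency, rather than a single fit, is what would be thresholded to pick features; the penalty `lam`, the number of rounds, and the half-sample size are illustrative choices, not values from the paper.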
format Online
Article
Text
id pubmed-7041299
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7041299 2020-03-03 GenEpi: gene-based epistasis discovery using machine learning BMC Bioinformatics Software BioMed Central 2020-02-24 /pmc/articles/PMC7041299/ /pubmed/32093643 http://dx.doi.org/10.1186/s12859-020-3368-2 Text en © The Author(s) 2020.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title GenEpi: gene-based epistasis discovery using machine learning
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7041299/
https://www.ncbi.nlm.nih.gov/pubmed/32093643
http://dx.doi.org/10.1186/s12859-020-3368-2