Identification of genetic interaction networks via an evolutionary algorithm evolved Bayesian network

Bibliographic Details
Main Authors: Li, Ruowang, Dudek, Scott M., Kim, Dokyoon, Hall, Molly A., Bradford, Yuki, Peissig, Peggy L., Brilliant, Murray H., Linneman, James G., McCarty, Catherine A., Bao, Le, Ritchie, Marylyn D.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4862166/
https://www.ncbi.nlm.nih.gov/pubmed/27168765
http://dx.doi.org/10.1186/s13040-016-0094-4
collection PubMed
description BACKGROUND: The future of medicine is moving towards the phase of precision medicine, with the goal of preventing and treating diseases by taking inter-individual variability into account. A large part of the variability lies in our genetic makeup. With the fast-paced improvement of high-throughput methods for genome sequencing, a tremendous amount of genetic data has already been generated. The next hurdle for precision medicine is to have sufficient computational tools for analyzing large sets of data. Genome-Wide Association Studies (GWAS) have been the primary method to assess the relationship between single nucleotide polymorphisms (SNPs) and disease traits. While GWAS is sufficient for finding individual SNPs with strong main effects, it does not capture potential interactions among multiple SNPs. In many traits, a large proportion of variation remains unexplained by main effects alone, leaving the door open for exploring the role of genetic interactions. However, identifying genetic interactions in large-scale genomics data poses a challenge even for modern computing.
RESULTS: For this study, we present a new algorithm, Grammatical Evolution Bayesian Network (GEBN), which uses Bayesian networks to identify interactions in the data and, at the same time, uses an evolutionary algorithm to reduce the computational cost associated with network optimization. GEBN excelled in simulation studies where the data contained main effects and interaction effects. We also applied GEBN to a Type 2 diabetes (T2D) dataset obtained from the Marshfield Personalized Medicine Research Project (PMRP). We were able to identify genetic interactions for T2D cases and controls and use information from those interactions to classify T2D samples. We obtained an average testing area under the curve (AUC) of 86.8%. We also identified several interacting genes, such as INADL and LPP, that are known to be associated with T2D.
CONCLUSIONS: Developing the computational tools to explore genetic associations beyond main effects remains a critically important challenge in human genetics. Methods such as GEBN demonstrate the utility of considering genetic interactions, as they likely explain some of the missing heritability.
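The abstract's point that single-SNP tests can miss interactions may be easier to see with a small worked example. The sketch below is not GEBN and is not taken from the article; it uses hypothetical toy data in which a trait appears only when exactly one of two binary SNPs carries the variant (a pure XOR interaction). The names `samples`, `main_effect`, and `joint_rates` are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy data (an assumption, not from the article): each
# combination of two binary SNP genotypes appears 25 times, and the trait
# is present only when exactly one SNP carries the variant (XOR).
samples = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)] * 25


def trait_rate(rows):
    """Fraction of rows in which the trait is present."""
    return sum(y for _, _, y in rows) / len(rows)


def main_effect(snp_index):
    """Difference in trait rate between carriers and non-carriers of one SNP."""
    carriers = [r for r in samples if r[snp_index] == 1]
    others = [r for r in samples if r[snp_index] == 0]
    return trait_rate(carriers) - trait_rate(others)


def joint_rates():
    """Trait rate for each combination of the two SNP genotypes."""
    return {
        (a, b): trait_rate([r for r in samples if r[0] == a and r[1] == b])
        for a, b in product((0, 1), repeat=2)
    }


if __name__ == "__main__":
    # Each SNP on its own shows no difference in trait rate.
    print(main_effect(0), main_effect(1))
    # Looking at both SNPs together separates the trait completely.
    print(joint_rates())
```

In this toy setting each SNP taken alone shows no association with the trait, while the pair of SNPs determines it completely, which is the kind of signal that a method modeling interactions is meant to recover.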
format Online
Article
Text
id pubmed-4862166
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4862166 2016-05-11 BioData Min Methodology BioMed Central 2016-05-10 /pmc/articles/PMC4862166/ /pubmed/27168765 http://dx.doi.org/10.1186/s13040-016-0094-4 Text en © Li et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Methodology