New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is e...
Main authors: | Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Mary Ann Liebert, Inc. 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5372779/ https://www.ncbi.nlm.nih.gov/pubmed/27681505 http://dx.doi.org/10.1089/cmb.2016.0100 |
_version_ | 1782518688224641024 |
---|---|
author | Gogoshin, Grigoriy Boerwinkle, Eric Rodin, Andrei S. |
author_facet | Gogoshin, Grigoriy Boerwinkle, Eric Rodin, Andrei S. |
author_sort | Gogoshin, Grigoriy |
collection | PubMed |
description | Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, software scalability and usability on ordinary, less-than-exotic computer hardware are a priority, as is the applicability of the algorithm and software to heterogeneous datasets containing many data types: single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, phenotypes, etc. |
format | Online Article Text |
id | pubmed-5372779 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Mary Ann Liebert, Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5372779 2017-05-03 New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data Gogoshin, Grigoriy Boerwinkle, Eric Rodin, Andrei S. J Comput Biol Research Articles Mary Ann Liebert, Inc. 2017-04-01 2017-04-01 /pmc/articles/PMC5372779/ /pubmed/27681505 http://dx.doi.org/10.1089/cmb.2016.0100 Text en © Grigoriy Gogoshin, et al., 2016. Published by Mary Ann Liebert, Inc. This Open Access article is distributed under the terms of the Creative Commons Attribution Noncommercial License (http://creativecommons.org/license/by-nc/4.0/) which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited. |
spellingShingle | Research Articles Gogoshin, Grigoriy Boerwinkle, Eric Rodin, Andrei S. New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data |
title | New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data |
title_full | New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data |
title_fullStr | New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data |
title_full_unstemmed | New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data |
title_short | New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data |
title_sort | new algorithm and software (bnomics) for inferring and visualizing bayesian networks from heterogeneous big biological and genetic data |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5372779/ https://www.ncbi.nlm.nih.gov/pubmed/27681505 http://dx.doi.org/10.1089/cmb.2016.0100 |
work_keys_str_mv | AT gogoshingrigoriy newalgorithmandsoftwarebnomicsforinferringandvisualizingbayesiannetworksfromheterogeneousbigbiologicalandgeneticdata AT boerwinkleeric newalgorithmandsoftwarebnomicsforinferringandvisualizingbayesiannetworksfromheterogeneousbigbiologicalandgeneticdata AT rodinandreis newalgorithmandsoftwarebnomicsforinferringandvisualizingbayesiannetworksfromheterogeneousbigbiologicalandgeneticdata |