
CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data

Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous...


Bibliographic Details
Main Authors: McGeachie, Michael J., Chang, Hsun-Hsien, Weiss, Scott T.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4055564/
https://www.ncbi.nlm.nih.gov/pubmed/24922310
http://dx.doi.org/10.1371/journal.pcbi.1003676
author McGeachie, Michael J.
Chang, Hsun-Hsien
Weiss, Scott T.
collection PubMed
description Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused on prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBN) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analyses on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and is available as MATLAB source code under an open-source license, by anonymous download from http://www.cgbayesnets.com.
format Online
Article
Text
id pubmed-4055564
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4055564 2014-06-18 CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data McGeachie, Michael J. Chang, Hsun-Hsien Weiss, Scott T. PLoS Comput Biol Research Article
Public Library of Science 2014-06-12 /pmc/articles/PMC4055564/ /pubmed/24922310 http://dx.doi.org/10.1371/journal.pcbi.1003676 Text en © 2014 McGeachie et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4055564/
https://www.ncbi.nlm.nih.gov/pubmed/24922310
http://dx.doi.org/10.1371/journal.pcbi.1003676