
Using Stochastic Causal Trees to Augment Bayesian Networks for Modeling eQTL Datasets

BACKGROUND: The combination of genotypic and genome-wide expression data arising from segregating populations offers an unprecedented opportunity to model and dissect complex phenotypes. The immense potential offered by these data derives from the fact that genotypic variation is the sole source of...


Bibliographic Details
Main authors: Chipman, Kyle C; Singh, Ambuj K
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3032670/
https://www.ncbi.nlm.nih.gov/pubmed/21211042
http://dx.doi.org/10.1186/1471-2105-12-7
author Chipman, Kyle C
Singh, Ambuj K
collection PubMed
description BACKGROUND: The combination of genotypic and genome-wide expression data arising from segregating populations offers an unprecedented opportunity to model and dissect complex phenotypes. The immense potential offered by these data derives from the fact that genotypic variation is the sole source of perturbation and can therefore be used to reconcile changes in gene expression programs with the parental genotypes. To date, several methodologies have been developed for modeling eQTL data. These methods generally leverage genotypic data to resolve causal relationships among gene pairs implicated as associates in the expression data. In particular, leading studies have augmented Bayesian networks with genotypic data, providing a powerful framework for learning and modeling causal relationships. While these initial efforts have provided promising results, one major drawback associated with these methods is that they are generally limited to resolving causal orderings for transcripts most proximal to the genomic loci. In this manuscript, we present a probabilistic method capable of learning the causal relationships between transcripts at all levels in the network. We use the information provided by our method as a prior for Bayesian network structure learning, resulting in enhanced performance for gene network reconstruction.
RESULTS: Using established protocols to synthesize eQTL networks and corresponding data, we show that our method achieves improved performance over existing leading methods. For the goal of gene network reconstruction, our method achieves improvements in recall ranging from 20% to 90% across a broad range of precision levels and for datasets of varying sample sizes. Additionally, we show that the learned networks can be utilized for expression quantitative trait loci mapping, resulting in upwards of 10-fold increases in recall over traditional univariate mapping.
CONCLUSIONS: Using the information from our method as a prior for Bayesian network structure learning yields large improvements in accuracy for the tasks of gene network reconstruction and expression quantitative trait loci mapping. In particular, our method is effective for establishing causal relationships between transcripts located both proximally and distally from genomic loci.
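The recall and precision figures quoted in the abstract refer to how well a reconstructed network recovers the edges of a known one. As a minimal sketch of how such scores are conventionally computed for directed edges, the Python below compares a predicted edge set against a reference edge set. The function name `precision_recall`, the node labels, and both toy networks are hypothetical illustrations and are not taken from the article; the authors' own evaluation protocol may differ.

```python
from typing import Set, Tuple

# A directed edge is (regulator, target); direction matters for causal networks.
Edge = Tuple[str, str]


def precision_recall(predicted: Set[Edge], truth: Set[Edge]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted directed edges against the truth."""
    true_positives = len(predicted & truth)
    # Precision: fraction of predicted edges that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of true edges that were recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy example: locus L1 perturbs transcript A, which acts on B and D.
truth = {("L1", "A"), ("A", "B"), ("B", "C"), ("A", "D")}
# The prediction reverses B -> C and adds a spurious edge D -> C.
predicted = {("L1", "A"), ("A", "B"), ("C", "B"), ("A", "D"), ("D", "C")}

precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because edges are compared as ordered pairs, an edge recovered with the wrong direction counts as both a false positive and a missed true edge, which is the stricter convention when the goal is causal ordering rather than undirected association.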
format Text
id pubmed-3032670
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3032670 2011-02-04 Using Stochastic Causal Trees to Augment Bayesian Networks for Modeling eQTL Datasets Chipman, Kyle C Singh, Ambuj K BMC Bioinformatics Research Article BioMed Central 2011-01-06 /pmc/articles/PMC3032670/ /pubmed/21211042 http://dx.doi.org/10.1186/1471-2105-12-7 Text en Copyright ©2011 Chipman and Singh; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Using Stochastic Causal Trees to Augment Bayesian Networks for Modeling eQTL Datasets
topic Research Article