Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data

BACKGROUND: Bayesian network (BN) modeling is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data alone suffer from high noise and low statistical power. Incorporating prior biological knowledge can improve performance. As each type of prior...

Bibliographic Details
Main authors:	Gao, Shouguo; Wang, Xujing
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203352/ https://www.ncbi.nlm.nih.gov/pubmed/21884587 http://dx.doi.org/10.1186/1471-2105-12-359

author	Gao, Shouguo; Wang, Xujing
collection	PubMed
description	BACKGROUND: Bayesian network (BN) modeling is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data alone suffer from high noise and low statistical power. Incorporating prior biological knowledge can improve performance. Because each type of prior knowledge may on its own be incomplete or limited by quality issues, it is desirable to integrate multiple sources of prior knowledge and use their consensus. RESULTS: We introduce a new method that incorporates the quantitative information from multiple sources of prior knowledge. It first uses a naïve Bayes classifier to assess the likelihood of functional linkage between gene pairs based on prior knowledge. In this study we included co-citation in PubMed and semantic similarity in Gene Ontology annotation. A candidate network edge reservoir is then created in which the copy number of each edge is proportional to the estimated likelihood of linkage between the two corresponding genes. In network simulation, a Markov chain Monte Carlo sampling algorithm is adopted that samples from this reservoir at each iteration to generate new candidate networks. We evaluated the new algorithm using both simulated and real gene expression data, including data from a yeast cell cycle study and a mouse pancreas development/growth study. Incorporating prior knowledge led to a ~2-fold increase in the number of known transcription regulations recovered, without a significant change in the false positive rate. In contrast, without the prior knowledge BN modeling is not always better than random selection, demonstrating the need in network modeling to supplement gene expression data with additional information. CONCLUSION: Our new development provides a statistical means to utilize the quantitative information in prior biological knowledge in the BN modeling of gene expression data, which significantly improves performance.
format	Online Article Text
id	pubmed-3203352
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3203352 2011-10-31 Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data Gao, Shouguo; Wang, Xujing BMC Bioinformatics Methodology Article BioMed Central 2011-08-31 /pmc/articles/PMC3203352/ /pubmed/21884587 http://dx.doi.org/10.1186/1471-2105-12-359 Text en Copyright ©2011 Gao and Wang; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203352/ https://www.ncbi.nlm.nih.gov/pubmed/21884587 http://dx.doi.org/10.1186/1471-2105-12-359
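
The abstract in this record describes two steps that a small example may make more concrete: scoring gene pairs with a naïve Bayes combination of prior-knowledge sources, and building an edge reservoir whose copy numbers are proportional to those scores. The following Python sketch illustrates that idea only and is not the authors' implementation. The gene names, prior odds, likelihood ratios, and the `scale` parameter are all hypothetical values chosen for illustration, and the MCMC acceptance step that scores candidate networks against expression data is omitted.

```python
import random
from collections import Counter


def linkage_posterior(prior_odds, likelihood_ratios):
    """Naive Bayes combination: treat evidence sources as independent and
    multiply the prior odds by each source's likelihood ratio."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)  # convert odds to a probability


def build_reservoir(edge_probs, scale=100):
    """Return a list in which each edge appears round(p * scale) times,
    so copy number is (approximately) proportional to linkage probability.
    Edges whose count rounds to zero are left out (an assumption here)."""
    reservoir = []
    for edge, p in edge_probs.items():
        reservoir.extend([edge] * round(p * scale))
    return reservoir


def propose_edge(reservoir, rng):
    """Draw one candidate edge uniformly from the reservoir; edges with
    more copies are proposed more often."""
    return rng.choice(reservoir)


# Hypothetical evidence: (co-citation LR, GO-similarity LR) per gene pair.
PRIOR_ODDS = 0.1
evidence = {
    ("geneA", "geneB"): [5.0, 4.0],  # well supported by both sources
    ("geneA", "geneC"): [1.0, 0.5],  # weakly supported
}

edge_probs = {e: linkage_posterior(PRIOR_ODDS, lrs) for e, lrs in evidence.items()}
reservoir = build_reservoir(edge_probs, scale=100)

rng = random.Random(0)  # fixed seed so the draw counts are reproducible
draws = Counter(propose_edge(reservoir, rng) for _ in range(10_000))
```

With these made-up inputs the well-supported pair receives 67 copies and the weakly supported pair 5, so a sampler drawing from the reservoir proposes the former far more often. The same weighting could be achieved without materializing copies by calling `random.choices` with a `weights` argument; the explicit list is used here only because it mirrors the reservoir wording of the abstract.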
