Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors

BACKGROUND: Phenotypic classification is problematic because small samples are ubiquitous; and, for these, use of prior knowledge is critical. If knowledge concerning the feature-label distribution – for instance, genetic pathways – is available, then it can be used in learning. Optimal Bayesian cla...

Bibliographic Details
Main Authors:	Boluki, Shahin; Esfahani, Mohammad Shahrokh; Qian, Xiaoning; Dougherty, Edward R
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751802/ https://www.ncbi.nlm.nih.gov/pubmed/29297278 http://dx.doi.org/10.1186/s12859-017-1893-4

_version_	1783290021988532224
author	Boluki, Shahin Esfahani, Mohammad Shahrokh Qian, Xiaoning Dougherty, Edward R
author_facet	Boluki, Shahin Esfahani, Mohammad Shahrokh Qian, Xiaoning Dougherty, Edward R
author_sort	Boluki, Shahin
collection	PubMed
description	BACKGROUND: Phenotypic classification is problematic because small samples are ubiquitous; and, for these, use of prior knowledge is critical. If knowledge concerning the feature-label distribution – for instance, genetic pathways – is available, then it can be used in learning. Optimal Bayesian classification provides optimal classification under model uncertainty. It differs from classical Bayesian methods in which a classification model is assumed and prior distributions are placed on model parameters. With optimal Bayesian classification, uncertainty is treated directly on the feature-label distribution, which assures full utilization of prior knowledge and is guaranteed to outperform classical methods. RESULTS: The salient problem confronting optimal Bayesian classification is prior construction. In this paper, we propose a new prior construction methodology based on a general framework of constraints in the form of conditional probability statements. We call this prior the maximal knowledge-driven information prior (MKDIP). The new constraint framework is more flexible than our previous methods as it naturally handles the potential inconsistency in archived regulatory relationships and conditioning can be augmented by other knowledge, such as population statistics. We also extend the application of prior construction to a multinomial mixture model when labels are unknown, which often occurs in practice. The performance of the proposed methods is examined on two important pathway families, the mammalian cell-cycle and a set of p53-related pathways, and also on a publicly available gene expression dataset of non-small cell lung cancer when combined with the existing prior knowledge on relevant signaling pathways. CONCLUSION: The new proposed general prior construction framework extends the prior construction methodology to a more flexible framework that results in better inference when proper prior knowledge exists. Moreover, the extension of optimal Bayesian classification to multinomial mixtures where data sets are both small and unlabeled, enables superior classifier design using small, unstructured data sets. We have demonstrated the effectiveness of our approach using pathway information and available knowledge of gene regulating functions; however, the underlying theory can be applied to a wide variety of knowledge types, and other applications when there are small samples.
format	Online Article Text
id	pubmed-5751802
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5751802 2018-01-05 Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors Boluki, Shahin Esfahani, Mohammad Shahrokh Qian, Xiaoning Dougherty, Edward R BMC Bioinformatics Research BioMed Central 2017-12-28 /pmc/articles/PMC5751802/ /pubmed/29297278 http://dx.doi.org/10.1186/s12859-017-1893-4 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Incorporating biological prior knowledge for Bayesian learning via maximal knowledge-driven information priors
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751802/ https://www.ncbi.nlm.nih.gov/pubmed/29297278 http://dx.doi.org/10.1186/s12859-017-1893-4
