Query-based biclustering of gene expression data using Probabilistic Relational Models

BACKGROUND: With the availability of large-scale expression compendia it is now possible to view one's own findings in the light of what is already available and to retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that...

Bibliographic Details
Main Authors: Zhao, Hui, Cloots, Lore, Van den Bulcke, Tim, Wu, Yan, De Smet, Riet, Storms, Valerie, Meysman, Pieter, Engelen, Kristof, Marchal, Kathleen
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044293/
https://www.ncbi.nlm.nih.gov/pubmed/21342568
http://dx.doi.org/10.1186/1471-2105-12-S1-S37
_version_ 1782198712870633472
author Zhao, Hui
Cloots, Lore
Van den Bulcke, Tim
Wu, Yan
De Smet, Riet
Storms, Valerie
Meysman, Pieter
Engelen, Kristof
Marchal, Kathleen
author_sort Zhao, Hui
collection PubMed
description BACKGROUND: With the availability of large-scale expression compendia it is now possible to view one's own findings in the light of what is already available and to retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that uses prior distributions to extract the information contained within the seed set. RESULTS: We applied ProBic to a large-scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets, and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high-quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. CONCLUSIONS: ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high-quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low-quality seed sets.
format Text
id pubmed-3044293
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3044293 2011-02-25 Query-based biclustering of gene expression data using Probabilistic Relational Models Zhao, Hui Cloots, Lore Van den Bulcke, Tim Wu, Yan De Smet, Riet Storms, Valerie Meysman, Pieter Engelen, Kristof Marchal, Kathleen BMC Bioinformatics Research
BioMed Central 2011-02-15 /pmc/articles/PMC3044293/ /pubmed/21342568 http://dx.doi.org/10.1186/1471-2105-12-S1-S37 Text en Copyright ©2011 Zhao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Query-based biclustering of gene expression data using Probabilistic Relational Models
topic Research