Query-based biclustering of gene expression data using Probabilistic Relational Models
BACKGROUND: With the availability of large-scale expression compendia, it is now possible to view one's own findings in the light of what is already available and to retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that...
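Main authors: | Zhao, Hui; Cloots, Lore; Van den Bulcke, Tim; Wu, Yan; De Smet, Riet; Storms, Valerie; Meysman, Pieter; Engelen, Kristof; Marchal, Kathleen |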
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044293/ https://www.ncbi.nlm.nih.gov/pubmed/21342568 http://dx.doi.org/10.1186/1471-2105-12-S1-S37 |
_version_ | 1782198712870633472 |
---|---|
author | Zhao, Hui Cloots, Lore Van den Bulcke, Tim Wu, Yan De Smet, Riet Storms, Valerie Meysman, Pieter Engelen, Kristof Marchal, Kathleen |
author_facet | Zhao, Hui Cloots, Lore Van den Bulcke, Tim Wu, Yan De Smet, Riet Storms, Valerie Meysman, Pieter Engelen, Kristof Marchal, Kathleen |
author_sort | Zhao, Hui |
collection | PubMed |
description | BACKGROUND: With the availability of large-scale expression compendia, it is now possible to view one's own findings in the light of what is already available and to retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set. RESULTS: We applied ProBic to a large-scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high-quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. CONCLUSIONS: ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high-quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low-quality seed sets. |
format | Text |
id | pubmed-3044293 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3044293 2011-02-25 Query-based biclustering of gene expression data using Probabilistic Relational Models Zhao, Hui Cloots, Lore Van den Bulcke, Tim Wu, Yan De Smet, Riet Storms, Valerie Meysman, Pieter Engelen, Kristof Marchal, Kathleen BMC Bioinformatics Research BACKGROUND: With the availability of large-scale expression compendia, it is now possible to view one's own findings in the light of what is already available and to retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set. RESULTS: We applied ProBic to a large-scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high-quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. CONCLUSIONS: ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high-quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low-quality seed sets.
BioMed Central 2011-02-15 /pmc/articles/PMC3044293/ /pubmed/21342568 http://dx.doi.org/10.1186/1471-2105-12-S1-S37 Text en Copyright ©2011 Zhao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Zhao, Hui Cloots, Lore Van den Bulcke, Tim Wu, Yan De Smet, Riet Storms, Valerie Meysman, Pieter Engelen, Kristof Marchal, Kathleen Query-based biclustering of gene expression data using Probabilistic Relational Models |
title | Query-based biclustering of gene expression data using Probabilistic Relational Models |
title_full | Query-based biclustering of gene expression data using Probabilistic Relational Models |
title_fullStr | Query-based biclustering of gene expression data using Probabilistic Relational Models |
title_full_unstemmed | Query-based biclustering of gene expression data using Probabilistic Relational Models |
title_short | Query-based biclustering of gene expression data using Probabilistic Relational Models |
title_sort | query-based biclustering of gene expression data using probabilistic relational models |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044293/ https://www.ncbi.nlm.nih.gov/pubmed/21342568 http://dx.doi.org/10.1186/1471-2105-12-S1-S37 |