Bayesian biclustering of gene expression data

BACKGROUND: Biclustering of gene expression data searches for local patterns of gene expression. A bicluster (or a two-way cluster) is defined as a set of genes whose expression profiles are mutually similar within a subset of experimental conditions/samples. Although several biclustering algorithms...

Bibliographic Details
Main authors:	Gu, Jiajun; Liu, Jun S
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2386069/ https://www.ncbi.nlm.nih.gov/pubmed/18366617 http://dx.doi.org/10.1186/1471-2164-9-S1-S4

author	Gu, Jiajun; Liu, Jun S
collection	PubMed
description	BACKGROUND: Biclustering of gene expression data searches for local patterns of gene expression. A bicluster (or a two-way cluster) is defined as a set of genes whose expression profiles are mutually similar within a subset of experimental conditions/samples. Although several biclustering algorithms have been studied, few are based on rigorous statistical models. RESULTS: We developed a Bayesian biclustering model (BBC), and implemented a Gibbs sampling procedure for its statistical inference. We showed that the Bayesian biclustering model can correctly identify multiple clusters of gene expression data. Using simulated data both from the model and with realistic characteristics, we demonstrated that the BBC algorithm outperforms other methods in both robustness and accuracy. We also showed that the model is stable for two normalization methods, the interquartile range normalization and the smallest quartile range normalization. Applying the BBC algorithm to the yeast expression data, we observed that the majority of the biclusters we found are supported by significant biological evidence, such as enrichment of gene functions and transcription factor binding sites in the corresponding promoter sequences. CONCLUSIONS: The BBC algorithm is shown to be a robust model-based biclustering method that can discover biologically significant gene-condition clusters in microarray data. The BBC model can easily handle missing data via Monte Carlo imputation and has the potential to be extended to the integrated study of gene transcription networks.
format	Text
id	pubmed-2386069
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2386069 2008-05-15 Bayesian biclustering of gene expression data Gu, Jiajun Liu, Jun S BMC Genomics Research BioMed Central 2008-03-20 /pmc/articles/PMC2386069/ /pubmed/18366617 http://dx.doi.org/10.1186/1471-2164-9-S1-S4 Text en Copyright © 2008 Gu and Liu; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Bayesian biclustering of gene expression data
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2386069/ https://www.ncbi.nlm.nih.gov/pubmed/18366617 http://dx.doi.org/10.1186/1471-2164-9-S1-S4
