
Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks

BACKGROUND: The learning of global genetic regulatory networks from expression data is a severely under-constrained problem that is aided by reducing the dimensionality of the search space by means of clustering genes into putatively co-regulated groups, as opposed to those that are simply co-expres...


Bibliographic Details
Main Authors:	Reiss, David J; Baliga, Nitin S; Bonneau, Richard
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1502140/ https://www.ncbi.nlm.nih.gov/pubmed/16749936 http://dx.doi.org/10.1186/1471-2105-7-280

_version_	1782128426758438912
author	Reiss, David J; Baliga, Nitin S; Bonneau, Richard
author_facet	Reiss, David J; Baliga, Nitin S; Bonneau, Richard
author_sort	Reiss, David J
collection	PubMed
description	BACKGROUND: The learning of global genetic regulatory networks from expression data is a severely under-constrained problem that is aided by reducing the dimensionality of the search space by means of clustering genes into putatively co-regulated groups, as opposed to those that are simply co-expressed. Because genes may be co-regulated only across a subset of all observed experimental conditions, biclustering (clustering of genes and conditions) is more appropriate than standard clustering. Co-regulated genes are also often functionally (physically, spatially, genetically, and/or evolutionarily) associated, and such a priori known or pre-computed associations can provide support for appropriately grouping genes. One important association is the presence of one or more common cis-regulatory motifs. In organisms where these motifs are not known, their de novo detection, integrated into the clustering algorithm, can help to guide the process towards more biologically parsimonious solutions. RESULTS: We have developed an algorithm, cMonkey, that detects putative co-regulated gene groupings by integrating the biclustering of gene expression data and various functional associations with the de novo detection of sequence motifs. CONCLUSION: We have applied this procedure to the archaeon Halobacterium NRC-1, as part of our efforts to decipher its regulatory network. In addition, we used cMonkey on public data for three organisms in the other two domains of life: Helicobacter pylori, Saccharomyces cerevisiae, and Escherichia coli. The biclusters detected by cMonkey both recapitulated known biology and enabled novel predictions (some for Halobacterium were subsequently confirmed in the laboratory). For example, it identified the bacteriorhodopsin regulon, assigned additional genes to this regulon with apparently unrelated function, and detected its known promoter motif. We have performed a thorough comparison of cMonkey results against other clustering methods, and find that cMonkey biclusters are more parsimonious with all available evidence for co-regulation.
format	Text
id	pubmed-1502140
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1502140 2006-07-14 Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks Reiss, David J; Baliga, Nitin S; Bonneau, Richard BMC Bioinformatics Methodology Article BACKGROUND: The learning of global genetic regulatory networks from expression data is a severely under-constrained problem that is aided by reducing the dimensionality of the search space by means of clustering genes into putatively co-regulated groups, as opposed to those that are simply co-expressed. Because genes may be co-regulated only across a subset of all observed experimental conditions, biclustering (clustering of genes and conditions) is more appropriate than standard clustering. Co-regulated genes are also often functionally (physically, spatially, genetically, and/or evolutionarily) associated, and such a priori known or pre-computed associations can provide support for appropriately grouping genes. One important association is the presence of one or more common cis-regulatory motifs. In organisms where these motifs are not known, their de novo detection, integrated into the clustering algorithm, can help to guide the process towards more biologically parsimonious solutions. RESULTS: We have developed an algorithm, cMonkey, that detects putative co-regulated gene groupings by integrating the biclustering of gene expression data and various functional associations with the de novo detection of sequence motifs. CONCLUSION: We have applied this procedure to the archaeon Halobacterium NRC-1, as part of our efforts to decipher its regulatory network. In addition, we used cMonkey on public data for three organisms in the other two domains of life: Helicobacter pylori, Saccharomyces cerevisiae, and Escherichia coli. The biclusters detected by cMonkey both recapitulated known biology and enabled novel predictions (some for Halobacterium were subsequently confirmed in the laboratory). For example, it identified the bacteriorhodopsin regulon, assigned additional genes to this regulon with apparently unrelated function, and detected its known promoter motif. We have performed a thorough comparison of cMonkey results against other clustering methods, and find that cMonkey biclusters are more parsimonious with all available evidence for co-regulation. BioMed Central 2006-06-02 /pmc/articles/PMC1502140/ /pubmed/16749936 http://dx.doi.org/10.1186/1471-2105-7-280 Text en Copyright © 2006 Reiss et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Methodology Article Reiss, David J Baliga, Nitin S Bonneau, Richard Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks
title	Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks
title_sort	integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1502140/ https://www.ncbi.nlm.nih.gov/pubmed/16749936 http://dx.doi.org/10.1186/1471-2105-7-280

