
CAGER: classification analysis of gene expression regulation using multiple information sources

BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as...


Bibliographic Details
Main authors:	Ruan, Jianhua; Zhang, Weixiong
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1174863/ https://www.ncbi.nlm.nih.gov/pubmed/15890068 http://dx.doi.org/10.1186/1471-2105-6-114

author	Ruan, Jianhua; Zhang, Weixiong
collection	PubMed
description	BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as experimentally verified binding motifs, motifs discovered by computer programs, or transcription factor binding data measured with Chromatin Immunoprecipitation (ChIP) assays, have been used towards this goal. Each type of features has been shown successful in modeling gene transcriptional regulation under certain conditions. However, no comparison has been made to evaluate the relative merit of these features. Furthermore, most publicly available classification tools were not designed specifically for modeling transcriptional regulation, and do not allow the user to combine different types of features. RESULTS: In this study, we use a specific classification method, decision trees, to model transcriptional regulation in yeast with features based on predefined motifs, automatically identified motifs, ChIP-chip data, or their combinations. We compare the accuracies and stability of these models, and analyze their capabilities in identifying functionally related genes. Furthermore, we design and implement a user-friendly web server called CAGER (Classification Analysis of Gene Expression Regulation) that integrates several software components for automated analysis of transcriptional regulation using decision trees. Finally, we use CAGER to study the transcriptional regulation of Arabidopsis genes in response to abscisic acid, and report some interesting new results. CONCLUSION: Models built with ChIP-chip data suffer from low accuracies when the condition under which gene expressions are measured is significantly different from the condition under which the ChIP experiment is conducted. Models built with automatically identified motifs can sometimes discover new features, but their modeling accuracies may have been over-estimated in previous studies. Furthermore, models built with automatically identified motifs are not stable with respect to noises. A combination of ChIP-chip data and predefined motifs can substantially improve modeling accuracies, and is effective in identifying true regulons. The CAGER web server, which is freely available at , allows the user to select combinations of different feature types for building decision trees, and interact with the models graphically. We believe that it will be a useful tool to facilitate the discovery of gene transcriptional regulatory networks.
format	Text
id	pubmed-1174863
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1174863 2005-07-09 CAGER: classification analysis of gene expression regulation using multiple information sources Ruan, Jianhua Zhang, Weixiong BMC Bioinformatics Research Article BioMed Central 2005-05-12 /pmc/articles/PMC1174863/ /pubmed/15890068 http://dx.doi.org/10.1186/1471-2105-6-114 Text en Copyright © 2005 Ruan and Zhang; licensee BioMed Central Ltd.
title	CAGER: classification analysis of gene expression regulation using multiple information sources
topic	Research Article
