
CAGER: classification analysis of gene expression regulation using multiple information sources

BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as...


Bibliographic Details
Main authors: Ruan, Jianhua; Zhang, Weixiong
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1174863/
https://www.ncbi.nlm.nih.gov/pubmed/15890068
http://dx.doi.org/10.1186/1471-2105-6-114
_version_ 1782124468586414080
author Ruan, Jianhua
Zhang, Weixiong
author_facet Ruan, Jianhua
Zhang, Weixiong
author_sort Ruan, Jianhua
collection PubMed
description BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as experimentally verified binding motifs, motifs discovered by computer programs, or transcription factor binding data measured with Chromatin Immunoprecipitation (ChIP) assays, have been used towards this goal. Each type of feature has been shown to be successful in modeling gene transcriptional regulation under certain conditions. However, no comparison has been made to evaluate the relative merit of these features. Furthermore, most publicly available classification tools were not designed specifically for modeling transcriptional regulation, and do not allow the user to combine different types of features. RESULTS: In this study, we use a specific classification method, decision trees, to model transcriptional regulation in yeast with features based on predefined motifs, automatically identified motifs, ChIP-chip data, or their combinations. We compare the accuracies and stability of these models, and analyze their capabilities in identifying functionally related genes. Furthermore, we design and implement a user-friendly web server called CAGER (Classification Analysis of Gene Expression Regulation) that integrates several software components for automated analysis of transcriptional regulation using decision trees. Finally, we use CAGER to study the transcriptional regulation of Arabidopsis genes in response to abscisic acid, and report some interesting new results. CONCLUSION: Models built with ChIP-chip data suffer from low accuracies when the condition under which gene expressions are measured is significantly different from the condition under which the ChIP experiment is conducted. Models built with automatically identified motifs can sometimes discover new features, but their modeling accuracies may have been over-estimated in previous studies. Furthermore, models built with automatically identified motifs are not stable with respect to noise. A combination of ChIP-chip data and predefined motifs can substantially improve modeling accuracies, and is effective in identifying true regulons. The CAGER web server, which is freely available, allows the user to select combinations of different feature types for building decision trees, and interact with the models graphically. We believe that it will be a useful tool to facilitate the discovery of gene transcriptional regulatory networks.
format Text
id pubmed-1174863
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1174863 2005-07-09 CAGER: classification analysis of gene expression regulation using multiple information sources Ruan, Jianhua Zhang, Weixiong BMC Bioinformatics Research Article BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as experimentally verified binding motifs, motifs discovered by computer programs, or transcription factor binding data measured with Chromatin Immunoprecipitation (ChIP) assays, have been used towards this goal. Each type of feature has been shown to be successful in modeling gene transcriptional regulation under certain conditions. However, no comparison has been made to evaluate the relative merit of these features. Furthermore, most publicly available classification tools were not designed specifically for modeling transcriptional regulation, and do not allow the user to combine different types of features. RESULTS: In this study, we use a specific classification method, decision trees, to model transcriptional regulation in yeast with features based on predefined motifs, automatically identified motifs, ChIP-chip data, or their combinations. We compare the accuracies and stability of these models, and analyze their capabilities in identifying functionally related genes. Furthermore, we design and implement a user-friendly web server called CAGER (Classification Analysis of Gene Expression Regulation) that integrates several software components for automated analysis of transcriptional regulation using decision trees. Finally, we use CAGER to study the transcriptional regulation of Arabidopsis genes in response to abscisic acid, and report some interesting new results. CONCLUSION: Models built with ChIP-chip data suffer from low accuracies when the condition under which gene expressions are measured is significantly different from the condition under which the ChIP experiment is conducted. Models built with automatically identified motifs can sometimes discover new features, but their modeling accuracies may have been over-estimated in previous studies. Furthermore, models built with automatically identified motifs are not stable with respect to noise. A combination of ChIP-chip data and predefined motifs can substantially improve modeling accuracies, and is effective in identifying true regulons. The CAGER web server, which is freely available, allows the user to select combinations of different feature types for building decision trees, and interact with the models graphically. We believe that it will be a useful tool to facilitate the discovery of gene transcriptional regulatory networks. BioMed Central 2005-05-12 /pmc/articles/PMC1174863/ /pubmed/15890068 http://dx.doi.org/10.1186/1471-2105-6-114 Text en Copyright © 2005 Ruan and Zhang; licensee BioMed Central Ltd.
spellingShingle Research Article
Ruan, Jianhua
Zhang, Weixiong
CAGER: classification analysis of gene expression regulation using multiple information sources
title CAGER: classification analysis of gene expression regulation using multiple information sources
title_full CAGER: classification analysis of gene expression regulation using multiple information sources
title_fullStr CAGER: classification analysis of gene expression regulation using multiple information sources
title_full_unstemmed CAGER: classification analysis of gene expression regulation using multiple information sources
title_short CAGER: classification analysis of gene expression regulation using multiple information sources
title_sort cager: classification analysis of gene expression regulation using multiple information sources
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1174863/
https://www.ncbi.nlm.nih.gov/pubmed/15890068
http://dx.doi.org/10.1186/1471-2105-6-114
work_keys_str_mv AT ruanjianhua cagerclassificationanalysisofgeneexpressionregulationusingmultipleinformationsources
AT zhangweixiong cagerclassificationanalysisofgeneexpressionregulationusingmultipleinformationsources