CAGER: classification analysis of gene expression regulation using multiple information sources
BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as...
Main authors: | Ruan, Jianhua; Zhang, Weixiong |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2005 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1174863/ https://www.ncbi.nlm.nih.gov/pubmed/15890068 http://dx.doi.org/10.1186/1471-2105-6-114 |
_version_ | 1782124468586414080 |
---|---|
author | Ruan, Jianhua Zhang, Weixiong |
author_facet | Ruan, Jianhua Zhang, Weixiong |
author_sort | Ruan, Jianhua |
collection | PubMed |
description | BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as experimentally verified binding motifs, motifs discovered by computer programs, or transcription factor binding data measured with Chromatin Immunoprecipitation (ChIP) assays, have been used towards this goal. Each type of features has been shown successful in modeling gene transcriptional regulation under certain conditions. However, no comparison has been made to evaluate the relative merit of these features. Furthermore, most publicly available classification tools were not designed specifically for modeling transcriptional regulation, and do not allow the user to combine different types of features. RESULTS: In this study, we use a specific classification method, decision trees, to model transcriptional regulation in yeast with features based on predefined motifs, automatically identified motifs, ChIP-chip data, or their combinations. We compare the accuracies and stability of these models, and analyze their capabilities in identifying functionally related genes. Furthermore, we design and implement a user-friendly web server called CAGER (Classification Analysis of Gene Expression Regulation) that integrates several software components for automated analysis of transcriptional regulation using decision trees. Finally, we use CAGER to study the transcriptional regulation of Arabidopsis genes in response to abscisic acid, and report some interesting new results. CONCLUSION: Models built with ChIP-chip data suffer from low accuracies when the condition under which gene expressions are measured is significantly different from the condition under which the ChIP experiment is conducted. Models built with automatically identified motifs can sometimes discover new features, but their modeling accuracies may have been over-estimated in previous studies. Furthermore, models built with automatically identified motifs are not stable with respect to noises. A combination of ChIP-chip data and predefined motifs can substantially improve modeling accuracies, and is effective in identifying true regulons. The CAGER web server, which is freely available at , allows the user to select combinations of different feature types for building decision trees, and interact with the models graphically. We believe that it will be a useful tool to facilitate the discovery of gene transcriptional regulatory networks. |
format | Text |
id | pubmed-1174863 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1174863 2005-07-09 CAGER: classification analysis of gene expression regulation using multiple information sources Ruan, Jianhua Zhang, Weixiong BMC Bioinformatics Research Article BACKGROUND: Many classification approaches have been applied to analyzing transcriptional regulation of gene expressions. These methods build models that can explain a gene's expression level from the regulatory elements (features) on its promoter sequence. Different types of features, such as experimentally verified binding motifs, motifs discovered by computer programs, or transcription factor binding data measured with Chromatin Immunoprecipitation (ChIP) assays, have been used towards this goal. Each type of features has been shown successful in modeling gene transcriptional regulation under certain conditions. However, no comparison has been made to evaluate the relative merit of these features. Furthermore, most publicly available classification tools were not designed specifically for modeling transcriptional regulation, and do not allow the user to combine different types of features. RESULTS: In this study, we use a specific classification method, decision trees, to model transcriptional regulation in yeast with features based on predefined motifs, automatically identified motifs, ChIP-chip data, or their combinations. We compare the accuracies and stability of these models, and analyze their capabilities in identifying functionally related genes. Furthermore, we design and implement a user-friendly web server called CAGER (Classification Analysis of Gene Expression Regulation) that integrates several software components for automated analysis of transcriptional regulation using decision trees. Finally, we use CAGER to study the transcriptional regulation of Arabidopsis genes in response to abscisic acid, and report some interesting new results. CONCLUSION: Models built with ChIP-chip data suffer from low accuracies when the condition under which gene expressions are measured is significantly different from the condition under which the ChIP experiment is conducted. Models built with automatically identified motifs can sometimes discover new features, but their modeling accuracies may have been over-estimated in previous studies. Furthermore, models built with automatically identified motifs are not stable with respect to noises. A combination of ChIP-chip data and predefined motifs can substantially improve modeling accuracies, and is effective in identifying true regulons. The CAGER web server, which is freely available at , allows the user to select combinations of different feature types for building decision trees, and interact with the models graphically. We believe that it will be a useful tool to facilitate the discovery of gene transcriptional regulatory networks. BioMed Central 2005-05-12 /pmc/articles/PMC1174863/ /pubmed/15890068 http://dx.doi.org/10.1186/1471-2105-6-114 Text en Copyright © 2005 Ruan and Zhang; licensee BioMed Central Ltd. |
spellingShingle | Research Article Ruan, Jianhua Zhang, Weixiong CAGER: classification analysis of gene expression regulation using multiple information sources |
title | CAGER: classification analysis of gene expression regulation using multiple information sources |
title_full | CAGER: classification analysis of gene expression regulation using multiple information sources |
title_fullStr | CAGER: classification analysis of gene expression regulation using multiple information sources |
title_full_unstemmed | CAGER: classification analysis of gene expression regulation using multiple information sources |
title_short | CAGER: classification analysis of gene expression regulation using multiple information sources |
title_sort | cager: classification analysis of gene expression regulation using multiple information sources |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1174863/ https://www.ncbi.nlm.nih.gov/pubmed/15890068 http://dx.doi.org/10.1186/1471-2105-6-114 |
work_keys_str_mv | AT ruanjianhua cagerclassificationanalysisofgeneexpressionregulationusingmultipleinformationsources AT zhangweixiong cagerclassificationanalysisofgeneexpressionregulationusingmultipleinformationsources |