
Identification of Yeast Transcriptional Regulation Networks Using Multivariate Random Forests

The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large scale assays. Here, we propose a novel algorithm,...


Bibliographic Details
Main Authors: Xiao, Yuanyuan; Segal, Mark R.
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2691601/
https://www.ncbi.nlm.nih.gov/pubmed/19543377
http://dx.doi.org/10.1371/journal.pcbi.1000414
_version_ 1782167888439803904
author Xiao, Yuanyuan
Segal, Mark R.
author_facet Xiao, Yuanyuan
Segal, Mark R.
author_sort Xiao, Yuanyuan
collection PubMed
description The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large-scale assays. Here, we propose a novel algorithm, based on random forest methodology, to relate gene expression (as derived from expression microarrays) to sequence features residing in gene promoters (as derived from DNA motif data) and transcription factor binding to gene promoters (as derived from tiling microarrays). We extend the random forest approach to model a multivariate response as represented, for example, by time-course gene expression measures. An analysis of the multivariate random forest output reveals complex regulatory networks, which consist of cohesive, condition-dependent regulatory cliques. Each regulatory clique features homogeneous gene expression profiles and common motifs or synergistic motif groups. We apply our method to several yeast physiological processes: cell cycle, sporulation, and various stress conditions. Our technique displays excellent performance with regard to identifying known regulatory motifs, including high-order interactions. In addition, we present evidence of the existence of an alternative MCB-binding pathway, which we confirm using data from two independent cell cycle studies and two other physiological processes. Finally, we have uncovered elaborate transcription regulation refinement mechanisms involving PAC and mRRPE motifs that govern essential rRNA processing. These include intriguing instances of differing motif dosages and differing combinatorial motif control that promote regulatory specificity in rRNA metabolism under differing physiological processes.
format Text
id pubmed-2691601
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2691601 2009-06-19 Identification of Yeast Transcriptional Regulation Networks Using Multivariate Random Forests Xiao, Yuanyuan Segal, Mark R. PLoS Comput Biol Research Article
Public Library of Science 2009-06-19 /pmc/articles/PMC2691601/ /pubmed/19543377 http://dx.doi.org/10.1371/journal.pcbi.1000414 Text en Xiao, Segal. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Identification of Yeast Transcriptional Regulation Networks Using Multivariate Random Forests
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2691601/
https://www.ncbi.nlm.nih.gov/pubmed/19543377
http://dx.doi.org/10.1371/journal.pcbi.1000414
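
The description in this record mentions extending random forests to a multivariate response such as time-course gene expression. As a rough illustration of that idea, the sketch below scores one tree split on a binary motif indicator against whole expression profiles rather than a single value. It is not the authors' implementation: the function names, the toy data, and the node cost (squared error summed over all time points) are assumptions made for this example, and the split function used in the article may differ.

```python
# Illustrative sketch only: one multivariate regression-tree split.
# All names and the toy data are hypothetical. The node cost used here
# (squared error summed over every time point) is an assumption and is
# not necessarily the split function used in the article.


def node_cost(profiles):
    """Sum of squared deviations from the node mean, over all response columns."""
    if not profiles:
        return 0.0
    n = len(profiles)
    n_cols = len(profiles[0])
    means = [sum(p[j] for p in profiles) / n for j in range(n_cols)]
    return sum((p[j] - means[j]) ** 2 for p in profiles for j in range(n_cols))


def best_motif_split(motifs, profiles):
    """Return (motif_index, cost_reduction) for the best presence/absence split."""
    parent = node_cost(profiles)
    best = (None, 0.0)
    for m in range(len(motifs[0])):
        absent = [y for x, y in zip(motifs, profiles) if x[m] == 0]
        present = [y for x, y in zip(motifs, profiles) if x[m] == 1]
        if not absent or not present:
            continue  # a split must send genes to both child nodes
        gain = parent - node_cost(absent) - node_cost(present)
        if gain > best[1]:
            best = (m, gain)
    return best


# Toy data: each row is a gene. Motif columns are binary presence indicators;
# profile columns are expression values at three time points.
motifs = [[1, 0], [1, 1], [0, 0], [0, 1]]
profiles = [[2.0, 1.0, 0.0], [2.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

split_motif, gain = best_motif_split(motifs, profiles)
print(split_motif, gain)
```

In this toy data the first motif separates the two expression profiles exactly, so it is the one selected. A forest would repeat this kind of split search recursively, over bootstrap samples of genes and random subsets of motifs.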