An ensemble learning approach to reverse-engineering transcriptional regulatory networks from time-series gene expression data
BACKGROUND: One of the most challenging tasks in the post-genomic era is to reconstruct transcriptional regulatory networks. The goal is to reveal, for each gene that responds to a certain biological event, which transcription factors affect its expression, and how a set of transcription factors...
Main Authors: | Ruan, Jianhua, Deng, Youping, Perkins, Edward J, Zhang, Weixiong |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2709269/ https://www.ncbi.nlm.nih.gov/pubmed/19594885 http://dx.doi.org/10.1186/1471-2164-10-S1-S8 |
_version_ | 1782169287659618304 |
---|---|
author | Ruan, Jianhua Deng, Youping Perkins, Edward J Zhang, Weixiong |
collection | PubMed |
description | BACKGROUND: One of the most challenging tasks in the post-genomic era is to reconstruct transcriptional regulatory networks. The goal is to reveal, for each gene that responds to a certain biological event, which transcription factors affect its expression, and how a set of transcription factors coordinate to accomplish temporally and spatially specific regulation. RESULTS: Here we propose a supervised machine learning approach to address these questions. We focus our study on the transcriptional regulation of the cell cycle in budding yeast, thanks to the large amount of data available and the relatively well-understood biology, although the main ideas of our method can be applied to other data as well. Our method starts by building an ensemble of decision trees for each microarray data set to capture the association between the expression levels of yeast genes and the binding of transcription factors to gene promoter regions, as determined by chromatin immunoprecipitation microarray (ChIP-chip) experiments. Cross-validation experiments show that the method is more accurate and reliable than the naive decision tree algorithm and several other ensemble learning methods. From the decision tree ensembles, we extract logical rules that explain how a set of transcription factors act in concert to regulate the expression of their targets. We further compute a profile for each rule to show its regulation strength at different time points. We also propose a spline interpolation method to integrate the rule profiles learned from several time-series expression data sets that measure the same biological process. We then combine these rule profiles to build a transcriptional regulatory network for the yeast cell cycle. Compared to the results in the literature, our method correctly identifies all major known yeast cell cycle transcription factors and assigns them to the appropriate cell cycle phases. Our method also identifies many interesting synergistic relationships among these transcription factors, most of which are well known, while many of the rest are supported by other evidence. CONCLUSION: The high accuracy of our method indicates that it is valid and robust. As more gene expression and transcription factor binding data become available, we believe that our method will be useful for reconstructing large-scale transcriptional regulatory networks in other species as well. |
format | Text |
id | pubmed-2709269 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2709269 2009-07-14 An ensemble learning approach to reverse-engineering transcriptional regulatory networks from time-series gene expression data Ruan, Jianhua Deng, Youping Perkins, Edward J Zhang, Weixiong BMC Genomics Research BioMed Central 2009-07-07 /pmc/articles/PMC2709269/ /pubmed/19594885 http://dx.doi.org/10.1186/1471-2164-10-S1-S8 Text en Copyright © 2009 Ruan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | An ensemble learning approach to reverse-engineering transcriptional regulatory networks from time-series gene expression data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2709269/ https://www.ncbi.nlm.nih.gov/pubmed/19594885 http://dx.doi.org/10.1186/1471-2164-10-S1-S8 |
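
The abstract describes building an ensemble of decision trees that relate transcription factor binding to gene expression, then extracting logical rules from the trees. The following is a minimal sketch of that general idea only, using plain bagging on synthetic data. It is not the authors' algorithm: the factor names (`TF_A`, `TF_B`, `TF_C`), the toy data set, and every function name here are hypothetical and chosen purely for illustration.

```python
import random
from collections import Counter

TFS = ["TF_A", "TF_B", "TF_C"]  # hypothetical transcription factor names


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a small tree on binary features (1 = factor bound to the promoter)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: predicted class
    best = None
    for f in range(len(rows[0])):
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        if not left or not right:
            continue  # this feature does not split the sample
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return majority
    f = best[1]
    lo = [(x, y) for x, y in zip(rows, labels) if x[f] == 0]
    hi = [(x, y) for x, y in zip(rows, labels) if x[f] == 1]
    return (f,
            build_tree([x for x, _ in lo], [y for _, y in lo], depth + 1, max_depth),
            build_tree([x for x, _ in hi], [y for _, y in hi], depth + 1, max_depth))


def predict_tree(tree, x):
    """Follow one tree from the root to a leaf."""
    while isinstance(tree, tuple):
        f, lo, hi = tree
        tree = hi if x[f] == 1 else lo
    return tree


def bagged_trees(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap resample of the genes (plain bagging)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict(forest, x):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, x) for t in forest).most_common(1)[0][0]


def extract_rules(tree, path=()):
    """Root-to-leaf paths that end in class 1, as tuples of (feature, value)."""
    if not isinstance(tree, tuple):
        return [path] if tree == 1 else []
    f, lo, hi = tree
    return extract_rules(lo, path + ((f, 0),)) + extract_rules(hi, path + ((f, 1),))


# Synthetic genes: a gene is "induced" only when TF_A and TF_B are both bound.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 5
labels = [a & b for a, b, _ in rows]

forest = bagged_trees(rows, labels)
rule_counts = Counter(frozenset(r) for t in forest for r in extract_rules(t))
top_rule = rule_counts.most_common(1)[0][0]
print(" AND ".join(f"{TFS[f]}={'bound' if v else 'unbound'}" for f, v in sorted(top_rule)))
```

Counting how often each rule recurs across the trees gives a rough sense of which factor combinations the ensemble supports. The rule profiles over time points and the spline-based integration mentioned in the abstract are not sketched here.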