A Semi-Supervised Method for Predicting Transcription Factor–Gene Interactions in Escherichia coli

Bibliographic Details
Main authors: Ernst, Jason; Beg, Qasim K.; Kay, Krin A.; Balázsi, Gábor; Oltvai, Zoltán N.; Bar-Joseph, Ziv
Format: Text
Language: English
Published: Public Library of Science 2008
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2266799/
https://www.ncbi.nlm.nih.gov/pubmed/18369434
http://dx.doi.org/10.1371/journal.pcbi.1000044
author Ernst, Jason
Beg, Qasim K.
Kay, Krin A.
Balázsi, Gábor
Oltvai, Zoltán N.
Bar-Joseph, Ziv
collection PubMed
description While Escherichia coli has one of the most comprehensive datasets of experimentally verified transcriptional regulatory interactions of any organism, it is still far from complete. This presents a problem when trying to combine gene expression and regulatory interactions to model transcriptional regulatory networks. Using the available regulatory interactions to predict new interactions may lead to better coverage and more accurate models. Here, we develop SEREND (SEmi-supervised REgulatory Network Discoverer), a semi-supervised learning method that uses a curated database of verified transcription factor–gene interactions, DNA sequence binding motifs, and a compendium of gene expression data to make thousands of new predictions about transcription factor–gene interactions, including whether the transcription factor activates or represses the gene. Using genome-wide binding datasets for several transcription factors, we demonstrate that our semi-supervised classification strategy improves the prediction of targets for a given transcription factor. To further demonstrate the utility of our inferred interactions, we generated a new microarray gene expression dataset for the aerobic-to-anaerobic shift response in E. coli. We used our inferred interactions with the verified interactions to reconstruct a dynamic regulatory network for this response. The network reconstructed when using our inferred interactions was better able to correctly identify known regulators and suggested additional activators and repressors as having important roles during the aerobic–anaerobic shift interface.
format Text
id pubmed-2266799
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2266799 2008-03-28 PLoS Comput Biol Research Article
Public Library of Science 2008-03-28 /pmc/articles/PMC2266799/ /pubmed/18369434 http://dx.doi.org/10.1371/journal.pcbi.1000044 Text en Ernst et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Semi-Supervised Method for Predicting Transcription Factor–Gene Interactions in Escherichia coli
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2266799/
https://www.ncbi.nlm.nih.gov/pubmed/18369434
http://dx.doi.org/10.1371/journal.pcbi.1000044