Genome-Wide Identification of Transcription Start Sites, Promoters and Transcription Factor Binding Sites in E. coli

Bibliographic Details
Main Authors: Mendoza-Vargas, Alfredo, Olvera, Leticia, Olvera, Maricela, Grande, Ricardo, Vega-Alvarado, Leticia, Taboada, Blanca, Jimenez-Jacinto, Verónica, Salgado, Heladia, Juárez, Katy, Contreras-Moreira, Bruno, Huerta, Araceli M., Collado-Vides, Julio, Morett, Enrique
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2760140/
https://www.ncbi.nlm.nih.gov/pubmed/19838305
http://dx.doi.org/10.1371/journal.pone.0007526
author Mendoza-Vargas, Alfredo
Olvera, Leticia
Olvera, Maricela
Grande, Ricardo
Vega-Alvarado, Leticia
Taboada, Blanca
Jimenez-Jacinto, Verónica
Salgado, Heladia
Juárez, Katy
Contreras-Moreira, Bruno
Huerta, Araceli M.
Collado-Vides, Julio
Morett, Enrique
collection PubMed
description Despite almost 40 years of molecular genetics research in Escherichia coli, a major fraction of its Transcription Start Sites (TSSs) is still unknown, which limits our understanding of the regulatory circuits that control gene expression in this model organism. RegulonDB (http://regulondb.ccg.unam.mx/) aims to integrate the genetic regulatory network of E. coli K12 and has until now been an entirely bioinformatic project. In this work, we extended its aims by generating genome-scale experimental data on TSSs, promoters and regulatory regions. We implemented a modified 5′ RACE protocol and an unbiased High Throughput Pyrosequencing Strategy (HTPS) that allowed us to map more than 1700 TSSs with high precision. Of this collection, about 230 corresponded to previously reported TSSs, which helped us to benchmark both our methodologies and the accuracy of the previous mapping experiments. The other ca. 1500 mapped TSSs belong to about 1000 different genes, many of them with no assigned function. We identified the promoter sequences and the types of σ factors that control the expression of about 80% of these genes. As expected, promoters for the housekeeping σ(70) were the most common, followed by those for σ(38). The majority of the putative TSSs were located 20 to 40 nucleotides from the translational start site. Putative regulatory binding sites for transcription factors were detected upstream of many TSSs. For a few transcripts, riboswitches and small RNAs were found. Several genes also had additional TSSs within the coding region. Unexpectedly, the HTPS experiments revealed extensive antisense transcription, probably serving regulatory functions. The new information in RegulonDB, which now holds more than 2400 experimentally determined TSSs, strengthens the accuracy of promoter prediction, operon structure, and regulatory networks, and will facilitate a global understanding of the complex and intricate regulatory network that operates in E. coli.
format Text
id pubmed-2760140
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2760140 2009-10-19 PLoS One Research Article Public Library of Science 2009-10-19 /pmc/articles/PMC2760140/ /pubmed/19838305 http://dx.doi.org/10.1371/journal.pone.0007526 Text en Mendoza-Vargas et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Genome-Wide Identification of Transcription Start Sites, Promoters and Transcription Factor Binding Sites in E. coli
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2760140/
https://www.ncbi.nlm.nih.gov/pubmed/19838305
http://dx.doi.org/10.1371/journal.pone.0007526