RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12
Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the...
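Main Authors: | Tierrafría, Víctor H. Rioualen, Claire Salgado, Heladia Lara, Paloma Gama-Castro, Socorro Lally, Patrick Gómez-Romero, Laura Peña-Loredo, Pablo López-Almazo, Andrés G. Alarcón-Carranza, Gabriel Betancourt-Figueroa, Felipe Alquicira-Hernández, Shirley Polanco-Morelos, J. Enrique García-Sotelo, Jair Gaytan-Nuñez, Estefani Méndez-Cruz, Carlos-Francisco Muñiz, Luis J. Bonavides-Martínez, César Moreno-Hagelsieb, Gabriel Galagan, James E. Wade, Joseph T. Collado-Vides, Julio |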
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Microbiology Society 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9465075/ https://www.ncbi.nlm.nih.gov/pubmed/35584008 http://dx.doi.org/10.1099/mgen.0.000833 |
_version_ | 1784787711604817920 |
---|---|
author | Tierrafría, Víctor H. Rioualen, Claire Salgado, Heladia Lara, Paloma Gama-Castro, Socorro Lally, Patrick Gómez-Romero, Laura Peña-Loredo, Pablo López-Almazo, Andrés G. Alarcón-Carranza, Gabriel Betancourt-Figueroa, Felipe Alquicira-Hernández, Shirley Polanco-Morelos, J. Enrique García-Sotelo, Jair Gaytan-Nuñez, Estefani Méndez-Cruz, Carlos-Francisco Muñiz, Luis J. Bonavides-Martínez, César Moreno-Hagelsieb, Gabriel Galagan, James E. Wade, Joseph T. Collado-Vides, Julio |
author_facet | Tierrafría, Víctor H. Rioualen, Claire Salgado, Heladia Lara, Paloma Gama-Castro, Socorro Lally, Patrick Gómez-Romero, Laura Peña-Loredo, Pablo López-Almazo, Andrés G. Alarcón-Carranza, Gabriel Betancourt-Figueroa, Felipe Alquicira-Hernández, Shirley Polanco-Morelos, J. Enrique García-Sotelo, Jair Gaytan-Nuñez, Estefani Méndez-Cruz, Carlos-Francisco Muñiz, Luis J. Bonavides-Martínez, César Moreno-Hagelsieb, Gabriel Galagan, James E. Wade, Joseph T. Collado-Vides, Julio |
author_sort | Tierrafría, Víctor H. |
collection | PubMed |
description | Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus, or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high- and low-throughput data on transcriptional regulation in Escherichia coli K-12 to easily use both as whole datasets, or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites, transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors, as well as data uniformly processed in-house, enhancing their comparability, as well as the traceability of the methods and reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12. |
format | Online Article Text |
id | pubmed-9465075 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Microbiology Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9465075 2022-09-12 RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 Tierrafría, Víctor H. Rioualen, Claire Salgado, Heladia Lara, Paloma Gama-Castro, Socorro Lally, Patrick Gómez-Romero, Laura Peña-Loredo, Pablo López-Almazo, Andrés G. Alarcón-Carranza, Gabriel Betancourt-Figueroa, Felipe Alquicira-Hernández, Shirley Polanco-Morelos, J. Enrique García-Sotelo, Jair Gaytan-Nuñez, Estefani Méndez-Cruz, Carlos-Francisco Muñiz, Luis J. Bonavides-Martínez, César Moreno-Hagelsieb, Gabriel Galagan, James E. Wade, Joseph T. Collado-Vides, Julio Microb Genom Research Articles Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus, or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high and low-throughput data on transcriptional regulation in Escherichia coli K-12 to easily use both as whole datasets, or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites, transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors, as well as data uniformly processed in-house, enhancing their comparability, as well as the traceability of the methods and reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying Natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12. Microbiology Society 2022-05-18 /pmc/articles/PMC9465075/ /pubmed/35584008 http://dx.doi.org/10.1099/mgen.0.000833 Text en © 2022 The Authors https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License. |
spellingShingle | Research Articles Tierrafría, Víctor H. Rioualen, Claire Salgado, Heladia Lara, Paloma Gama-Castro, Socorro Lally, Patrick Gómez-Romero, Laura Peña-Loredo, Pablo López-Almazo, Andrés G. Alarcón-Carranza, Gabriel Betancourt-Figueroa, Felipe Alquicira-Hernández, Shirley Polanco-Morelos, J. Enrique García-Sotelo, Jair Gaytan-Nuñez, Estefani Méndez-Cruz, Carlos-Francisco Muñiz, Luis J. Bonavides-Martínez, César Moreno-Hagelsieb, Gabriel Galagan, James E. Wade, Joseph T. Collado-Vides, Julio RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 |
title | RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 |
title_full | RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 |
title_fullStr | RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 |
title_full_unstemmed | RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 |
title_short | RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12 |
title_sort | regulondb 11.0: comprehensive high-throughput datasets on transcriptional regulation in escherichia coli k-12 |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9465075/ https://www.ncbi.nlm.nih.gov/pubmed/35584008 http://dx.doi.org/10.1099/mgen.0.000833 |