UniBind: maps of high-confidence direct TF-DNA interactions across nine species

BACKGROUND: Transcription factors (TFs) bind specifically to TF binding sites (TFBSs) at cis-regulatory regions to control transcription. It is critical to locate these TF-DNA interactions to understand transcriptional regulation. Efforts to predict bona fide TFBSs benefit from the availability of e...

Bibliographic Details
Main authors: Puig, Rafael Riudavets; Boddie, Paul; Khan, Aziz; Castro-Mondragon, Jaime Abraham; Mathelier, Anthony
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236138/
https://www.ncbi.nlm.nih.gov/pubmed/34174819
http://dx.doi.org/10.1186/s12864-021-07760-6
_version_ 1783714476461129728
author Puig, Rafael Riudavets
Boddie, Paul
Khan, Aziz
Castro-Mondragon, Jaime Abraham
Mathelier, Anthony
author_facet Puig, Rafael Riudavets
Boddie, Paul
Khan, Aziz
Castro-Mondragon, Jaime Abraham
Mathelier, Anthony
author_sort Puig, Rafael Riudavets
collection PubMed
description BACKGROUND: Transcription factors (TFs) bind specifically to TF binding sites (TFBSs) at cis-regulatory regions to control transcription. It is critical to locate these TF-DNA interactions to understand transcriptional regulation. Efforts to predict bona fide TFBSs benefit from the availability of experimental data mapping DNA binding regions of TFs (chromatin immunoprecipitation followed by sequencing - ChIP-seq). RESULTS: In this study, we processed ~ 10,000 public ChIP-seq datasets from nine species to provide high-quality TFBS predictions. After quality control, it culminated with the prediction of ~ 56 million TFBSs with experimental and computational support for direct TF-DNA interactions for 644 TFs in > 1000 cell lines and tissues. These TFBSs were used to predict > 197,000 cis-regulatory modules representing clusters of binding events in the corresponding genomes. The high-quality of the TFBSs was reinforced by their evolutionary conservation, enrichment at active cis-regulatory regions, and capacity to predict combinatorial binding of TFs. Further, we confirmed that the cell type and tissue specificity of enhancer activity was correlated with the number of TFs with binding sites predicted in these regions. All the data is provided to the community through the UniBind database that can be accessed through its web-interface (https://unibind.uio.no/), a dedicated RESTful API, and as genomic tracks. Finally, we provide an enrichment tool, available as a web-service and an R package, for users to find TFs with enriched TFBSs in a set of provided genomic regions. CONCLUSIONS: UniBind is the first resource of its kind, providing the largest collection of high-confidence direct TF-DNA interactions in nine species. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07760-6.
format Online
Article
Text
id pubmed-8236138
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-82361382021-06-28 UniBind: maps of high-confidence direct TF-DNA interactions across nine species Puig, Rafael Riudavets Boddie, Paul Khan, Aziz Castro-Mondragon, Jaime Abraham Mathelier, Anthony BMC Genomics Research BACKGROUND: Transcription factors (TFs) bind specifically to TF binding sites (TFBSs) at cis-regulatory regions to control transcription. It is critical to locate these TF-DNA interactions to understand transcriptional regulation. Efforts to predict bona fide TFBSs benefit from the availability of experimental data mapping DNA binding regions of TFs (chromatin immunoprecipitation followed by sequencing - ChIP-seq). RESULTS: In this study, we processed ~ 10,000 public ChIP-seq datasets from nine species to provide high-quality TFBS predictions. After quality control, it culminated with the prediction of ~ 56 million TFBSs with experimental and computational support for direct TF-DNA interactions for 644 TFs in > 1000 cell lines and tissues. These TFBSs were used to predict > 197,000 cis-regulatory modules representing clusters of binding events in the corresponding genomes. The high-quality of the TFBSs was reinforced by their evolutionary conservation, enrichment at active cis-regulatory regions, and capacity to predict combinatorial binding of TFs. Further, we confirmed that the cell type and tissue specificity of enhancer activity was correlated with the number of TFs with binding sites predicted in these regions. All the data is provided to the community through the UniBind database that can be accessed through its web-interface (https://unibind.uio.no/), a dedicated RESTful API, and as genomic tracks. Finally, we provide an enrichment tool, available as a web-service and an R package, for users to find TFs with enriched TFBSs in a set of provided genomic regions. CONCLUSIONS: UniBind is the first resource of its kind, providing the largest collection of high-confidence direct TF-DNA interactions in nine species. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07760-6. BioMed Central 2021-06-26 /pmc/articles/PMC8236138/ /pubmed/34174819 http://dx.doi.org/10.1186/s12864-021-07760-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Puig, Rafael Riudavets
Boddie, Paul
Khan, Aziz
Castro-Mondragon, Jaime Abraham
Mathelier, Anthony
UniBind: maps of high-confidence direct TF-DNA interactions across nine species
title UniBind: maps of high-confidence direct TF-DNA interactions across nine species
title_full UniBind: maps of high-confidence direct TF-DNA interactions across nine species
title_fullStr UniBind: maps of high-confidence direct TF-DNA interactions across nine species
title_full_unstemmed UniBind: maps of high-confidence direct TF-DNA interactions across nine species
title_short UniBind: maps of high-confidence direct TF-DNA interactions across nine species
title_sort unibind: maps of high-confidence direct tf-dna interactions across nine species
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236138/
https://www.ncbi.nlm.nih.gov/pubmed/34174819
http://dx.doi.org/10.1186/s12864-021-07760-6
work_keys_str_mv AT puigrafaelriudavets unibindmapsofhighconfidencedirecttfdnainteractionsacrossninespecies
AT boddiepaul unibindmapsofhighconfidencedirecttfdnainteractionsacrossninespecies
AT khanaziz unibindmapsofhighconfidencedirecttfdnainteractionsacrossninespecies
AT castromondragonjaimeabraham unibindmapsofhighconfidencedirecttfdnainteractionsacrossninespecies
AT mathelieranthony unibindmapsofhighconfidencedirecttfdnainteractionsacrossninespecies