Most of the tight positional conservation of transcription factor binding sites near the transcription start site reflects their co-localization within regulatory modules

Bibliographic Details
Main Authors: Acevedo-Luna, Natalia, Mariño-Ramírez, Leonardo, Halbert, Armand, Hansen, Ulla, Landsman, David, Spouge, John L.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5117513/
https://www.ncbi.nlm.nih.gov/pubmed/27871221
http://dx.doi.org/10.1186/s12859-016-1354-5
collection PubMed
description BACKGROUND: Transcription factors (TFs) form complexes that bind regulatory modules (RMs) within DNA, to control specific sets of genes. Some transcription factor binding sites (TFBSs) near the transcription start site (TSS) display tight positional preferences relative to the TSS. Furthermore, near the TSS, RMs can co-localize TFBSs with each other and the TSS. The proportion of TFBS positional preferences due to TFBS co-localization within RMs is unknown, however. ChIP experiments confirm co-localization of some TFBSs genome-wide, including near the TSS, but they typically examine only a few TFs at a time, using non-physiological conditions that can vary from lab to lab. In contrast, sequence analysis can examine many TFs uniformly and methodically, broadly surveying the co-localization of TFBSs with tight positional preferences relative to the TSS.
RESULTS: Our statistics found 43 significant sets of human motifs in the JASPAR TF Database with positional preferences relative to the TSS, with 38 preferences tight (±5 bp). Each set of motifs corresponded to a gene group of 135 to 3304 genes, with 42/43 (98%) gene groups independently validated by DAVID, a gene ontology database, with FDR < 0.05. Motifs corresponding to two TFBSs in an RM should co-occur more than by chance alone, enriching the intersection of the gene groups corresponding to the two TFs. Thus, a gene-group intersection systematically enriched beyond chance alone provides evidence that the two TFs participate in an RM. Of the 903 = 43*42/2 intersections of the 43 significant gene groups, we found 768/903 (85%) pairs of gene groups with significantly enriched intersections, with 564/768 (73%) intersections independently validated by DAVID with FDR < 0.05. A user-friendly web site at http://go.usa.gov/3kjsH permits biologists to explore the interaction network of our TFBSs to identify candidate subunit RMs.
CONCLUSIONS: Gene duplication and convergent evolution within a genome provide obvious biological mechanisms for replicating an RM near the TSS that binds a particular TF subunit. Of all intersections of our 43 significant gene groups, 85% were significantly enriched, with 73% of the significant enrichments independently validated by gene ontology. The co-localization of TFBSs within RMs therefore likely explains much of the tight TFBS positional preferences near the TSS.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1354-5) contains supplementary material, which is available to authorized users.
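The counts quoted in the abstract can be checked with a few lines of Python, and the idea of an intersection "enriched beyond chance alone" can be illustrated with a one-sided hypergeometric tail probability. This is only a minimal sketch: the abstract does not name the authors' actual test, so the hypergeometric model, the function names, and the toy numbers below are assumptions made for illustration.

```python
from math import comb


def pair_count(n_groups):
    """Number of unordered pairs among n_groups gene groups."""
    return n_groups * (n_groups - 1) // 2


def hypergeom_tail(universe, size_a, size_b, overlap):
    """P(X >= overlap), where X is the overlap between a fixed group A
    (size_a genes) and size_b genes drawn without replacement from a
    universe of `universe` genes. Assumed null model, not the paper's."""
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    hits = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return hits / total


# Arithmetic stated in the abstract.
print(pair_count(43))            # 903 pairwise intersections
print(round(100 * 42 / 43))      # 98 (% of gene groups validated)
print(round(100 * 768 / 903))    # 85 (% of intersections enriched)
print(round(100 * 564 / 768))    # 73 (% of enriched intersections validated)

# Toy example with hypothetical sizes (not data from the paper).
p = hypergeom_tail(universe=1000, size_a=100, size_b=150, overlap=30)
print(p < 0.05)
```

A small tail probability means the two gene groups share more genes than the assumed null model would predict; in a survey of 903 pairs, such p-values would also need a multiple-testing correction.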
format Online
Article
Text
id pubmed-5117513
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5117513 2016-11-28 BMC Bioinformatics Research Article BioMed Central 2016-11-21 /pmc/articles/PMC5117513/ /pubmed/27871221 http://dx.doi.org/10.1186/s12859-016-1354-5 Text en © The Author(s). 2016 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research Article