A simple metric of promoter architecture robustly predicts expression breadth of human genes suggesting that most transcription factors are positive regulators

BACKGROUND: Conventional wisdom holds that, owing to the dominance of features such as chromatin level control, the expression of a gene cannot be readily predicted from knowledge of promoter architecture. This is reflected, for example, in a weak or absent correlation between promoter divergence an...


Bibliographic Details
Main authors: Hurst, Laurence D; Sachenkova, Oxana; Daub, Carsten; Forrest, Alistair RR; Huminiecki, Lukasz
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4310617/
https://www.ncbi.nlm.nih.gov/pubmed/25079787
http://dx.doi.org/10.1186/s13059-014-0413-3
author Hurst, Laurence D
Sachenkova, Oxana
Daub, Carsten
Forrest, Alistair RR
Huminiecki, Lukasz
author_sort Hurst, Laurence D
collection PubMed
description BACKGROUND: Conventional wisdom holds that, owing to the dominance of features such as chromatin level control, the expression of a gene cannot be readily predicted from knowledge of promoter architecture. This is reflected, for example, in a weak or absent correlation between promoter divergence and expression divergence between paralogs. However, an inability to predict may reflect an inability to accurately measure or employment of the wrong parameters. Here we address this issue through integration of two exceptional resources: ENCODE data on transcription factor binding and the FANTOM5 high-resolution expression atlas. RESULTS: Consistent with the notion that in eukaryotes most transcription factors are activating, the number of transcription factors binding a promoter is a strong predictor of expression breadth. In addition, evolutionarily young duplicates have fewer transcription factor binders and narrower expression. Nonetheless, we find several binders and cooperative sets that are disproportionately associated with broad expression, indicating that models more complex than simple correlations should hold more predictive power. Indeed, a machine learning approach improves fit to the data compared with a simple correlation. Machine learning could at best moderately predict tissue of expression of tissue specific genes. CONCLUSIONS: We find robust evidence that some expression parameters and paralog expression divergence are strongly predictable with knowledge of transcription factor binding repertoire. While some cooperative complexes can be identified, consistent with the notion that most eukaryotic transcription factors are activating, a simple predictor, the number of binding transcription factors found on a promoter, is a robust predictor of expression breadth. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0413-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4310617
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4310617 2015-01-30 Genome Biol Research BioMed Central 2014-07-31 2014 /pmc/articles/PMC4310617/ /pubmed/25079787 http://dx.doi.org/10.1186/s13059-014-0413-3 Text en © Hurst et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A simple metric of promoter architecture robustly predicts expression breadth of human genes suggesting that most transcription factors are positive regulators
topic Research