Transcription factors, coregulators, and epigenetic marks are linearly correlated and highly redundant

The DNA microstates that regulate transcription include sequence-specific transcription factors (TFs), coregulatory complexes, nucleosomes, histone modifications, DNA methylation, and parts of the three-dimensional architecture of genomes, which could create an enormous combinatorial complexity acro...

Bibliographic Details
Main Authors: Ahsendorf, Tobias; Müller, Franz-Josef; Topkar, Ved; Gunawardena, Jeremy; Eils, Roland
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5720766/
https://www.ncbi.nlm.nih.gov/pubmed/29216191
http://dx.doi.org/10.1371/journal.pone.0186324
author Ahsendorf, Tobias
Müller, Franz-Josef
Topkar, Ved
Gunawardena, Jeremy
Eils, Roland
collection PubMed
description The DNA microstates that regulate transcription include sequence-specific transcription factors (TFs), coregulatory complexes, nucleosomes, histone modifications, DNA methylation, and parts of the three-dimensional architecture of genomes, which could create an enormous combinatorial complexity across the genome. However, many proteins and epigenetic marks are known to colocalize, suggesting that the information content encoded in these marks can be compressed. It has so far proved difficult to understand this compression in a systematic and quantitative manner. Here, we show that simple linear models can reliably predict the data generated by the ENCODE and Roadmap Epigenomics consortia. Further, we demonstrate that a small number of marks can predict all other marks with high average correlation across the genome, systematically revealing the substantial information compression that is present in different cell lines. We find that the linear models for activating marks are typically cell line-independent, while those for silencing marks are predominantly cell line-specific. Of particular note, a nuclear receptor corepressor, transducin beta-like 1 X-linked receptor 1 (TBLR1), was highly predictive of other marks in two hematopoietic cell lines. The methodology presented here shows how the potentially vast complexity of TFs, coregulators, and epigenetic marks at eukaryotic genes is highly redundant and that the information present can be compressed onto a much smaller subset of marks. These findings could be used to efficiently characterize cell lines and tissues based on a small number of diagnostic marks and suggest how the DNA microstates, which regulate the expression of individual genes, can be specified.
format Online
Article
Text
id pubmed-5720766
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5720766 2017-12-15 Transcription factors, coregulators, and epigenetic marks are linearly correlated and highly redundant Ahsendorf, Tobias; Müller, Franz-Josef; Topkar, Ved; Gunawardena, Jeremy; Eils, Roland PLoS One Research Article
Public Library of Science 2017-12-07 /pmc/articles/PMC5720766/ /pubmed/29216191 http://dx.doi.org/10.1371/journal.pone.0186324 Text en © 2017 Ahsendorf et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Transcription factors, coregulators, and epigenetic marks are linearly correlated and highly redundant
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5720766/
https://www.ncbi.nlm.nih.gov/pubmed/29216191
http://dx.doi.org/10.1371/journal.pone.0186324
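
The description's central claim, that a small number of marks can linearly predict other marks with high correlation, can be illustrated with a minimal sketch. It uses synthetic signals rather than ENCODE or Roadmap Epigenomics data, and the names `mark_a`, `mark_b`, `target_mark`, `fit_linear`, `predict`, and `pearson` are hypothetical. The fit is plain ordinary least squares, which may differ from the authors' actual procedure.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)  # [intercept, weight_1, weight_2, ...]


def predict(w, X):
    """Apply fitted weights w to each row of predictor values."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


# Synthetic stand-ins for per-region signal of three marks (assumed, not real data).
random.seed(0)
n = 500
mark_a = [random.gauss(0, 1) for _ in range(n)]
mark_b = [random.gauss(0, 1) for _ in range(n)]
# The target mark is constructed as a noisy linear combination of the two predictors.
target_mark = [0.8 * a + 0.5 * b + random.gauss(0, 0.3) for a, b in zip(mark_a, mark_b)]

X = list(zip(mark_a, mark_b))
half = n // 2
w = fit_linear(X[:half], target_mark[:half])                 # fit on the first half
r = pearson(predict(w, X[half:]), target_mark[half:])        # score on the held-out half
print(f"weights={w}, held-out correlation={r:.3f}")
```

Because the target here is linear in the predictors by construction, the held-out correlation is high. With real data, that correlation is the quantity that would indicate how much of one mark's information is already carried by the others.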