Computational inference of H3K4me3 and H3K27ac domain length

Background. Recent epigenomic studies have shown that the length of a DNA region covered by an epigenetic mark is not just a byproduct of the assaying technologies and has functional implications for that locus. For example, expanded regions of DNA sequences that are marked by enhancer-specific hist...

Bibliographic Details
Main Authors: Zubek, Julian, Stitzel, Michael L., Ucar, Duygu, Plewczynski, Dariusz M.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2016
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4793332/
https://www.ncbi.nlm.nih.gov/pubmed/26989607
http://dx.doi.org/10.7717/peerj.1750
_version_ 1782421386930683904
author Zubek, Julian
Stitzel, Michael L.
Ucar, Duygu
Plewczynski, Dariusz M.
author_facet Zubek, Julian
Stitzel, Michael L.
Ucar, Duygu
Plewczynski, Dariusz M.
author_sort Zubek, Julian
collection PubMed
description Background. Recent epigenomic studies have shown that the length of a DNA region covered by an epigenetic mark is not just a byproduct of the assaying technology but has functional implications for that locus. For example, expanded domains marked by enhancer-specific histone modifications, such as acetylation of histone H3 lysine 27 (H3K27ac), coincide with cell-specific enhancers known as super or stretch enhancers. Similarly, promoters of genes critical for cell-specific functions are marked by expanded H3K4me3 domains in the cognate cell type, and these can span DNA regions from 4–5 kb up to 40–50 kb in length. These expanded H3K4me3 domains are known as buffer domains or super promoters. Methods. To ask what correlates with, and potentially regulates, the length of loci marked with these two important histone marks, H3K4me3 and H3K27ac, we built Random Forest regression models. With these models, we computationally identified genomic and epigenomic patterns that are predictive of the length of these marks in seven ENCODE cell lines. Results. We found that certain epigenetic marks and transcription factors explain the variability of the length of H3K4me3 and H3K27ac marks across different cell types, which implies that the lengths of these two epigenetic marks are tightly regulated in a given cell type. Our source code for the regression models and data can be found at our GitHub page: https://github.com/zubekj/broad_peaks. Discussion. Our Random Forest-based regression models enabled us to estimate the individual contribution of different epigenetic marks and protein binding patterns to the length of H3K4me3 and H3K27ac deposition patterns, thereby potentially revealing genomic signatures at cell-specific regulatory elements.
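The modelling idea summarized in the Methods and Discussion above (regress domain length on genomic and epigenomic features, then ask how much each feature contributes) can be illustrated with a minimal sketch. This is not the authors' code, which is available at the GitHub link given in the abstract. It assumes scikit-learn's RandomForestRegressor and its impurity-based feature importances. The data are synthetic, and the feature names (mark_A_signal, tf_B_binding, gc_content) and the helper rank_features are hypothetical placeholders, not taken from the article.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rank_features(X, y, names, n_trees=200, seed=0):
    """Fit a Random Forest regressor and rank features by importance.

    Returns (name, importance) pairs, most important first. Importances
    are scikit-learn's impurity-based values, which sum to 1.
    """
    model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(names[i], float(importances[i])) for i in order]


# Synthetic stand-in data: 500 "domains" described by three features.
# Feature names are hypothetical and chosen only for illustration.
rng = np.random.default_rng(0)
names = ["mark_A_signal", "tf_B_binding", "gc_content"]
X = rng.random((500, 3))

# Simulated domain length in base pairs: dominated by the first feature,
# weakly dependent on the second, independent of the third.
y = 4000 + 20000 * X[:, 0] + 2000 * X[:, 1] + rng.normal(0, 500, 500)

ranking = rank_features(X, y, names)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

On this synthetic data the feature that drives the simulated length ranks first. Impurity-based importances are only one way to estimate per-feature contribution, and the article may have used a different measure.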
format Online
Article
Text
id pubmed-4793332
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-4793332 2016-03-17 Computational inference of H3K4me3 and H3K27ac domain length Zubek, Julian Stitzel, Michael L. Ucar, Duygu Plewczynski, Dariusz M. PeerJ Bioinformatics PeerJ Inc. 2016-03-14 /pmc/articles/PMC4793332/ /pubmed/26989607 http://dx.doi.org/10.7717/peerj.1750 Text en ©2016 Zubek et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Zubek, Julian
Stitzel, Michael L.
Ucar, Duygu
Plewczynski, Dariusz M.
Computational inference of H3K4me3 and H3K27ac domain length
title Computational inference of H3K4me3 and H3K27ac domain length
title_full Computational inference of H3K4me3 and H3K27ac domain length
title_fullStr Computational inference of H3K4me3 and H3K27ac domain length
title_full_unstemmed Computational inference of H3K4me3 and H3K27ac domain length
title_short Computational inference of H3K4me3 and H3K27ac domain length
title_sort computational inference of h3k4me3 and h3k27ac domain length
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4793332/
https://www.ncbi.nlm.nih.gov/pubmed/26989607
http://dx.doi.org/10.7717/peerj.1750
work_keys_str_mv AT zubekjulian computationalinferenceofh3k4me3andh3k27acdomainlength
AT stitzelmichaell computationalinferenceofh3k4me3andh3k27acdomainlength
AT ucarduygu computationalinferenceofh3k4me3andh3k27acdomainlength
AT plewczynskidariuszm computationalinferenceofh3k4me3andh3k27acdomainlength