Combinatorial epigenetic patterns as quantitative predictors of chromatin biology
BACKGROUND: Chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) is the most widely used method for characterizing the epigenetic states of chromatin on a genomic scale. With the recent availability of large genome-wide data sets, often comprising several epigenetic marks, novel appr...
Main authors: | Cieślik, Marcin; Bekiranov, Stefan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3922690/ https://www.ncbi.nlm.nih.gov/pubmed/24472558 http://dx.doi.org/10.1186/1471-2164-15-76 |
_version_ | 1782303487577554944 |
---|---|
author | Cieślik, Marcin Bekiranov, Stefan |
author_facet | Cieślik, Marcin Bekiranov, Stefan |
author_sort | Cieślik, Marcin |
collection | PubMed |
description | BACKGROUND: Chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) is the most widely used method for characterizing the epigenetic states of chromatin on a genomic scale. With the recent availability of large genome-wide data sets, often comprising several epigenetic marks, novel approaches are required to explore functionally relevant interactions between histone modifications. Computational discovery of "chromatin states" defined by such combinatorial interactions enabled descriptive annotations of genomes, but more quantitative approaches are needed to progress towards predictive models. RESULTS: We propose non-negative matrix factorization (NMF) as a new unsupervised method to discover combinatorial patterns of epigenetic marks that frequently co-occur in subsets of genomic regions. We show that this small set of combinatorial "codes" can be effectively displayed and interpreted. NMF codes enable dimensionality reduction and have desirable statistical properties for regression and classification tasks. We demonstrate the utility of codes in the quantitative prediction of Pol2-binding and the discrimination between Pol2-bound promoters and enhancers. Finally, we show that specific codes can be linked to molecular pathways and targets of pluripotency genes during differentiation. CONCLUSIONS: We have introduced and evaluated a new computational approach to represent combinatorial patterns of epigenetic marks as quantitative variables suitable for predictive modeling and supervised machine learning. To foster widespread adoption of this method we make it available as an open-source software-package – epicode at https://github.com/mcieslik-mctp/epicode. |
format | Online Article Text |
id | pubmed-3922690 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3922690 2014-02-27 Combinatorial epigenetic patterns as quantitative predictors of chromatin biology Cieślik, Marcin Bekiranov, Stefan BMC Genomics Methodology Article BioMed Central 2014-01-28 /pmc/articles/PMC3922690/ /pubmed/24472558 http://dx.doi.org/10.1186/1471-2164-15-76 Text en Copyright © 2014 Cieślik and Bekiranov; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Cieślik, Marcin Bekiranov, Stefan Combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
title | Combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
title_full | Combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
title_fullStr | Combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
title_full_unstemmed | Combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
title_short | Combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
title_sort | combinatorial epigenetic patterns as quantitative predictors of chromatin biology |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3922690/ https://www.ncbi.nlm.nih.gov/pubmed/24472558 http://dx.doi.org/10.1186/1471-2164-15-76 |
work_keys_str_mv | AT cieslikmarcin combinatorialepigeneticpatternsasquantitativepredictorsofchromatinbiology AT bekiranovstefan combinatorialepigeneticpatternsasquantitativepredictorsofchromatinbiology |
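
The abstract in this record proposes non-negative matrix factorization (NMF) to find combinatorial "codes" of epigenetic marks that co-occur across genomic regions. As a rough illustration of that idea only, the sketch below factorizes a small regions-by-marks matrix with the classic multiplicative-update rules for the Frobenius objective. It is not the epicode implementation, and the record does not say which NMF algorithm the authors use. The toy matrix `V`, the function names, and the choice of `k=2` are all assumptions made for this example; the code is dependency-free Python so it can be run as is.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh)) ** 0.5


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (regions x marks) as W (regions x k) times H (k x marks).

    Uses multiplicative updates, which keep every entry of W and H
    non-negative when the starting values are positive.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Made-up signal levels: rows are genomic regions, columns are four
# hypothetical marks (mark_A .. mark_D). Marks A and B co-occur in the
# first three regions, marks C and D in the last three.
V = [
    [5.0, 4.0, 0.0, 0.0],
    [10.0, 8.0, 0.0, 0.0],
    [2.5, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 6.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 4.0, 8.0],
]

W, H = nmf(V, k=2)

# Each row of H is one "code": a non-negative weighting over the marks.
for code in H:
    print([round(x, 2) for x in code])
print("reconstruction error:", round(frobenius_error(V, W, H), 4))
```

In this framing each row of `W` gives a per-region, non-negative loading on every code, which is the kind of quantitative variable the abstract describes feeding into regression or classification. A production analysis would more likely rely on an established NMF library than on this hand-rolled loop.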