Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform
BACKGROUND: Larger variation exists in epigenomes than in genomes, as a single genome shapes the identity of multiple cell types. With the advent of next-generation sequencing, one of the key problems in computational epigenomics is the poor understanding of correlations and quantitative differences...
Main authors: | Madrigal, Pedro, Krajewski, Paweł |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4488123/ https://www.ncbi.nlm.nih.gov/pubmed/26140054 http://dx.doi.org/10.1186/s13040-015-0051-7 |
_version_ | 1782379100673933312 |
---|---|
author | Madrigal, Pedro Krajewski, Paweł |
author_facet | Madrigal, Pedro Krajewski, Paweł |
author_sort | Madrigal, Pedro |
collection | PubMed |
description | BACKGROUND: Larger variation exists in epigenomes than in genomes, as a single genome shapes the identity of multiple cell types. With the advent of next-generation sequencing, one of the key problems in computational epigenomics is the poor understanding of correlations and quantitative differences between large scale data sets. RESULTS: Here we bring to genomics a scenario of functional principal component analysis, a finite Karhunen-Loève transform, and explicitly decompose the variation in the coverage profiles of 27 chromatin mark ChIP-seq datasets at transcription start sites for H1, one of the most used human embryonic stem cell lines. Using this approach we identify positive correlations between H3K4me3 and H3K36me3, as well as between H3K9ac and H3K36me3, so far undetected by the most commonly used Pearson correlation between read enrichment coverages. We uncover highly negative correlations between H2A.Z, H3K4me3, and several histone acetylation marks, but these occur only between principal components of first and second order. We also demonstrate that levels of gene expression correlate significantly with scores of components of order higher than one, demonstrating that transcriptional regulation by histone marks escapes simple one-to-one relationships. These correlations were higher in significance and magnitude in protein coding genes than in non-coding RNAs. CONCLUSIONS: In summary, we present a methodology to explore and uncover novel patterns of epigenomic variability and covariability in genomic data sets by using a functional eigenvalue decomposition of genomic data. R code is available at: http://github.com/pmb59/KLTepigenome. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-015-0051-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4488123 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-44881232015-07-03 Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform Madrigal, Pedro Krajewski, Paweł BioData Min Methodology BACKGROUND: Larger variation exists in epigenomes than in genomes, as a single genome shapes the identity of multiple cell types. With the advent of next-generation sequencing, one of the key problems in computational epigenomics is the poor understanding of correlations and quantitative differences between large scale data sets. RESULTS: Here we bring to genomics a scenario of functional principal component analysis, a finite Karhunen-Loève transform, and explicitly decompose the variation in the coverage profiles of 27 chromatin mark ChIP-seq datasets at transcription start sites for H1, one of the most used human embryonic stem cell lines. Using this approach we identify positive correlations between H3K4me3 and H3K36me3, as well as between H3K9ac and H3K36me3, so far undetected by the most commonly used Pearson correlation between read enrichment coverages. We uncover highly negative correlations between H2A.Z, H3K4me3, and several histone acetylation marks, but these occur only between principal components of first and second order. We also demonstrate that levels of gene expression correlate significantly with scores of components of order higher than one, demonstrating that transcriptional regulation by histone marks escapes simple one-to-one relationships. These correlations were higher in significance and magnitude in protein coding genes than in non-coding RNAs. CONCLUSIONS: In summary, we present a methodology to explore and uncover novel patterns of epigenomic variability and covariability in genomic data sets by using a functional eigenvalue decomposition of genomic data. R code is available at: http://github.com/pmb59/KLTepigenome.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-015-0051-7) contains supplementary material, which is available to authorized users. BioMed Central 2015-07-01 /pmc/articles/PMC4488123/ /pubmed/26140054 http://dx.doi.org/10.1186/s13040-015-0051-7 Text en © Madrigal and Krajewski. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Madrigal, Pedro Krajewski, Paweł Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform |
title | Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform |
title_full | Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform |
title_fullStr | Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform |
title_full_unstemmed | Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform |
title_short | Uncovering correlated variability in epigenomic datasets using the Karhunen-Loeve transform |
title_sort | uncovering correlated variability in epigenomic datasets using the karhunen-loeve transform |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4488123/ https://www.ncbi.nlm.nih.gov/pubmed/26140054 http://dx.doi.org/10.1186/s13040-015-0051-7 |
work_keys_str_mv | AT madrigalpedro uncoveringcorrelatedvariabilityinepigenomicdatasetsusingthekarhunenloevetransform AT krajewskipaweł uncoveringcorrelatedvariabilityinepigenomicdatasetsusingthekarhunenloevetransform |
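
Illustrative note: the abstract describes decomposing coverage profiles with a finite Karhunen-Loève transform and then correlating component scores between marks. The Python sketch below shows that general idea in a discretized form (ordinary PCA via SVD on profiles sampled on a grid). It is not the authors' R implementation from KLTepigenome, and it does not use a functional basis expansion. The data are synthetic, and the names `klt_scores`, `score_correlations`, `mark_a`, and `mark_b` are hypothetical, chosen only for this example.

```python
import numpy as np


def klt_scores(profiles, n_components=2):
    """Discrete Karhunen-Loeve transform (PCA) of row-wise profiles.

    profiles: array of shape (n_sites, n_positions), one profile per row.
    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(profiles, dtype=float)
    centered = X - X.mean(axis=0)  # subtract the mean profile
    # Rows of vt are the eigenvectors of the sample covariance matrix.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T  # projection of each site onto each component
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, components, explained[:n_components]


def score_correlations(scores_a, scores_b):
    """Pearson correlations between component scores of two marks.

    Entry [i, j] relates component i+1 of mark A to component j+1 of mark B.
    """
    k = scores_a.shape[1]
    full = np.corrcoef(scores_a.T, scores_b.T)
    return full[:k, k:]


# Synthetic stand-in for coverage profiles around 200 sites (assumption: toy data).
rng = np.random.default_rng(0)
n_sites, n_pos = 200, 50
x = np.linspace(-1.0, 1.0, n_pos)
peak = np.exp(-(x / 0.3) ** 2)        # symmetric peak centred on the site
slope = x * np.exp(-(x / 0.5) ** 2)   # asymmetric shape around the site
a1, a2 = rng.normal(size=(2, n_sites))

# Mark A: dominant mode driven by a1, weaker second mode driven by a2.
mark_a = (3.0 * np.outer(a1, peak) + np.outer(a2, slope)
          + 0.05 * rng.normal(size=(n_sites, n_pos)))
# Mark B: dominant mode driven by a2, i.e. by mark A's second mode.
mark_b = 3.0 * np.outer(a2, peak) + 0.05 * rng.normal(size=(n_sites, n_pos))

scores_a, _, explained_a = klt_scores(mark_a)
scores_b, _, explained_b = klt_scores(mark_b)
corr = score_correlations(scores_a, scores_b)
print(np.round(corr, 2))
```

In this toy setup the strong relationship appears in `corr[1, 0]`, between the second component of mark A and the first component of mark B. This mirrors the kind of cross-order correlation the abstract reports. The sign of each entry is arbitrary, because principal components are defined only up to sign.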