Large-scale epigenome imputation improves data quality and disease variant enrichment

With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression...

Bibliographic Details
Main authors: Ernst, Jason; Kellis, Manolis
Format: Online Article Text
Language: English
Published: 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4512306/
https://www.ncbi.nlm.nih.gov/pubmed/25690853
http://dx.doi.org/10.1038/nbt.3157
author Ernst, Jason
Kellis, Manolis
collection PubMed
description With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression trees. We impute 4,315 high-resolution signal maps, of which 26% are also experimentally observed. Imputed signal tracks show overall similarity to observed signals, and surpass experimental datasets in consistency, recovery of gene annotations, and enrichment for disease-associated variants. We use the imputed data to detect low quality experimental datasets, to find genomic sites with unexpected epigenomic signals, to define high-priority marks for new experiments, and to delineate chromatin states in 127 reference epigenomes spanning diverse tissues and cell types. Our imputed datasets provide the most comprehensive human regulatory annotation to date, and our approach and the ChromImpute software constitute a useful complement to large-scale experimental mapping of epigenomic information.
format Online
Article
Text
id pubmed-4512306
institution National Center for Biotechnology Information
language English
publishDate 2015
record_format MEDLINE/PubMed
spelling pubmed-4512306 2015-10-01 Ernst, Jason; Kellis, Manolis. Nat Biotechnol. Article. 2015-02-18 2015-04 /pmc/articles/PMC4512306/ /pubmed/25690853 http://dx.doi.org/10.1038/nbt.3157 Text en http://www.nature.com/authors/editorial_policies/license.html#terms Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
topic Article