Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model
Main authors: | Tang, Li; Hill, Matthew C.; Wang, Jun; Wang, Jianxin; Martin, James F.; Li, Min |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2020 |
Subjects: | Method |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7706734/ https://www.ncbi.nlm.nih.gov/pubmed/33184104 http://dx.doi.org/10.1101/gr.264606.120 |
_version_ | 1783617211387084800 |
---|---|
author | Tang, Li Hill, Matthew C. Wang, Jun Wang, Jianxin Martin, James F. Li, Min |
author_facet | Tang, Li Hill, Matthew C. Wang, Jun Wang, Jianxin Martin, James F. Li, Min |
author_sort | Tang, Li |
collection | PubMed |
description | Transcriptional enhancers commonly work over long genomic distances to precisely regulate spatiotemporal gene expression patterns. Dissecting the promoters physically contacted by these distal regulatory elements is essential for understanding developmental processes as well as the role of disease-associated risk variants. Modern proximity-ligation assays, like HiChIP and ChIA-PET, facilitate the accurate identification of long-range contacts between enhancers and promoters. However, these assays are technically challenging, expensive, and time-consuming, making it difficult to investigate enhancer topologies, especially in uncharacterized cell types. To overcome these shortcomings, we therefore designed LoopPredictor, an ensemble machine learning model, to predict genome topology for cell types which lack long-range contact maps. To enrich for functional enhancer-promoter loops over common structural genomic contacts, we trained LoopPredictor with both H3K27ac and YY1 HiChIP data. Moreover, the integration of several related multi-omics features facilitated identifying and annotating the predicted loops. LoopPredictor is able to efficiently identify cell type–specific enhancer-mediated loops, and promoter–promoter interactions, with a modest feature input requirement. Comparable to experimentally generated H3K27ac HiChIP data, we found that LoopPredictor was able to identify functional enhancer loops. Furthermore, to explore the cross-species prediction capability of LoopPredictor, we fed mouse multi-omics features into a model trained on human data and found that the predicted enhancer loops outputs were highly conserved. LoopPredictor enables the dissection of cell type–specific long-range gene regulation and can accelerate the identification of distal disease-associated risk variants. |
format | Online Article Text |
id | pubmed-7706734 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-77067342021-06-01 Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model Tang, Li Hill, Matthew C. Wang, Jun Wang, Jianxin Martin, James F. Li, Min Genome Res Method Transcriptional enhancers commonly work over long genomic distances to precisely regulate spatiotemporal gene expression patterns. Dissecting the promoters physically contacted by these distal regulatory elements is essential for understanding developmental processes as well as the role of disease-associated risk variants. Modern proximity-ligation assays, like HiChIP and ChIA-PET, facilitate the accurate identification of long-range contacts between enhancers and promoters. However, these assays are technically challenging, expensive, and time-consuming, making it difficult to investigate enhancer topologies, especially in uncharacterized cell types. To overcome these shortcomings, we therefore designed LoopPredictor, an ensemble machine learning model, to predict genome topology for cell types which lack long-range contact maps. To enrich for functional enhancer-promoter loops over common structural genomic contacts, we trained LoopPredictor with both H3K27ac and YY1 HiChIP data. Moreover, the integration of several related multi-omics features facilitated identifying and annotating the predicted loops. LoopPredictor is able to efficiently identify cell type–specific enhancer-mediated loops, and promoter–promoter interactions, with a modest feature input requirement. Comparable to experimentally generated H3K27ac HiChIP data, we found that LoopPredictor was able to identify functional enhancer loops. Furthermore, to explore the cross-species prediction capability of LoopPredictor, we fed mouse multi-omics features into a model trained on human data and found that the predicted enhancer loops outputs were highly conserved. 
LoopPredictor enables the dissection of cell type–specific long-range gene regulation and can accelerate the identification of distal disease-associated risk variants. Cold Spring Harbor Laboratory Press 2020-12 /pmc/articles/PMC7706734/ /pubmed/33184104 http://dx.doi.org/10.1101/gr.264606.120 Text en © 2020 Tang et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/. |
spellingShingle | Method Tang, Li Hill, Matthew C. Wang, Jun Wang, Jianxin Martin, James F. Li, Min Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
title | Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
title_full | Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
title_fullStr | Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
title_full_unstemmed | Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
title_short | Predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
title_sort | predicting unrecognized enhancer-mediated genome topology by an ensemble machine learning model |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7706734/ https://www.ncbi.nlm.nih.gov/pubmed/33184104 http://dx.doi.org/10.1101/gr.264606.120 |
work_keys_str_mv | AT tangli predictingunrecognizedenhancermediatedgenometopologybyanensemblemachinelearningmodel AT hillmatthewc predictingunrecognizedenhancermediatedgenometopologybyanensemblemachinelearningmodel AT wangjun predictingunrecognizedenhancermediatedgenometopologybyanensemblemachinelearningmodel AT wangjianxin predictingunrecognizedenhancermediatedgenometopologybyanensemblemachinelearningmodel AT martinjamesf predictingunrecognizedenhancermediatedgenometopologybyanensemblemachinelearningmodel AT limin predictingunrecognizedenhancermediatedgenometopologybyanensemblemachinelearningmodel |