
Sequence characteristics and an accurate model of high-occupancy target loci in the human genome


Bibliographic Details
Main Authors: Hudaiberdiev, Sanjarbek; Ovcharenko, Ivan
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028745/
https://www.ncbi.nlm.nih.gov/pubmed/36945558
http://dx.doi.org/10.1101/2023.02.05.527203
author Hudaiberdiev, Sanjarbek
Ovcharenko, Ivan
collection PubMed
description Enhancers and promoters are considered to be bound by a small set of TFs in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays have expanded. Particularly, high-occupancy target (HOT) loci attract hundreds of TFs with seemingly no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used 1,003 TF ChIP-seq datasets in HepG2, K562, and H1 cells to analyze the patterns of ChIP-seq peak co-occurrence combined with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions and determined that HOT promoters regulate housekeeping genes, whereas the HOT enhancers are involved in extremely tissue-specific processes. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of them being ultraconserved regions. Sequence-based classification of HOT loci using deep learning suggests that their formation is driven by sequence features, and the density of ChIP-seq peaks correlates with sequence features. Based on their affinities to bind to promoters and enhancers, we detected five distinct clusters of TFs that form the core of the HOT loci. We also observed that HOT loci are enriched in 3D chromatin hubs and disease-causal variants. In a challenge to the classical model of enhancer activity, we report an abundance of HOT loci in human genome and a commitment of 51% of all ChIP-seq binding events to HOT locus formation and propose a model of HOT locus formation based on the existence of large transcriptional condensates.
format Online
Article
Text
id pubmed-10028745
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10028745 2023-03-22 Sequence characteristics and an accurate model of high-occupancy target loci in the human genome Hudaiberdiev, Sanjarbek Ovcharenko, Ivan bioRxiv Article
Cold Spring Harbor Laboratory 2023-02-05 /pmc/articles/PMC10028745/ /pubmed/36945558 http://dx.doi.org/10.1101/2023.02.05.527203 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Sequence characteristics and an accurate model of high-occupancy target loci in the human genome
topic Article