A universal framework for detecting cis-regulatory diversity in DNA regions

High-throughput sequencing-based assays measure different biochemical activities pertaining to gene regulation, genome-wide. These activities include transcription factor (TF)–DNA binding, enhancer activity, open chromatin, and more. A major goal is to understand underlying sequence components, or m...

Bibliographic Details
Main authors: Biswas, Anushua; Narlikar, Leelavati
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415372/
https://www.ncbi.nlm.nih.gov/pubmed/34285090
http://dx.doi.org/10.1101/gr.274563.120
author Biswas, Anushua
Narlikar, Leelavati
collection PubMed
description High-throughput sequencing-based assays measure different biochemical activities pertaining to gene regulation, genome-wide. These activities include transcription factor (TF)–DNA binding, enhancer activity, open chromatin, and more. A major goal is to understand underlying sequence components, or motifs, that can explain the measured activity. It is usually not one motif but a combination of motifs bound by cooperatively acting proteins that confers activity to such regions. Furthermore, regions can be diverse, governed by different combinations of TFs/motifs. Current approaches do not take into account this issue of combinatorial diversity. We present a new statistical framework, cisDIVERSITY, which models regions as diverse modules characterized by combinations of motifs while simultaneously learning the motifs themselves. Because cisDIVERSITY does not rely on knowledge of motifs, modules, cell type, or organism, it is general enough to be applied to regions reported by most high-throughput assays. For example, in enhancer predictions resulting from different assays—GRO-cap, STARR-seq, and those measuring chromatin structure—cisDIVERSITY discovers distinct modules and combinations of TF binding sites, some specific to the assay. From protein–DNA binding data, cisDIVERSITY identifies potential cofactors of the profiled TF, whereas from ATAC-seq data, it identifies tissue-specific regulatory modules. Finally, analysis of single-cell ATAC-seq data suggests that regions open in one cell-state encode information about future states, with certain modules staying open and others closing down in the next time point.
format Online
Article
Text
id pubmed-8415372
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-8415372 2022-03-01 A universal framework for detecting cis-regulatory diversity in DNA regions Biswas, Anushua Narlikar, Leelavati Genome Res Method
Cold Spring Harbor Laboratory Press 2021-09 /pmc/articles/PMC8415372/ /pubmed/34285090 http://dx.doi.org/10.1101/gr.274563.120 Text en © 2021 Biswas and Narlikar; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title A universal framework for detecting cis-regulatory diversity in DNA regions
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415372/
https://www.ncbi.nlm.nih.gov/pubmed/34285090
http://dx.doi.org/10.1101/gr.274563.120