
On the dependent recognition of some long zinc finger proteins

The human genome contains about 800 C2H2 zinc finger proteins (ZFPs), and most of them are composed of long arrays of zinc fingers. The standard ZFP recognition model asserts that longer finger arrays should recognize longer DNA-binding sites. However, recent experimental efforts to identify in vivo ZFP bind...


Bibliographic Details
Main Authors: Zuo, Zheng; Billings, Timothy; Walker, Michael; Petkov, Petko M; Fordyce, Polly M; Stormo, Gary D
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Gene regulation, Chromatin and Epigenetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10287918/
https://www.ncbi.nlm.nih.gov/pubmed/36951113
http://dx.doi.org/10.1093/nar/gkad207
author Zuo, Zheng
Billings, Timothy
Walker, Michael
Petkov, Petko M
Fordyce, Polly M
Stormo, Gary D
collection PubMed
description The human genome contains about 800 C2H2 zinc finger proteins (ZFPs), and most of them are composed of long arrays of zinc fingers. The standard ZFP recognition model asserts that longer finger arrays should recognize longer DNA-binding sites. However, recent experimental efforts to identify in vivo ZFP binding sites contradict this assumption, with many ZFPs exhibiting short motifs. Here we use ZFY, CTCF, ZIM3, and ZNF343 as examples to address three closely related questions: What impedes current motif discovery methods? What are the functions of those seemingly unused fingers? And how can we improve motif discovery algorithms based on the biophysical properties of long ZFPs? Using ZFY, we employed a variety of methods and found evidence for ‘dependent recognition’, in which downstream fingers can recognize some previously undiscovered motifs only in the presence of an intact core site. For CTCF, high-throughput measurements revealed that its upstream specificity profile depends on the strength of its core. Moreover, the binding strength of the upstream site modulates CTCF’s sensitivity to different epigenetic modifications within the core, providing new insight into how the previously identified intellectual disability-causing and cancer-related mutant R567W disrupts upstream recognition and deregulates epigenetic control by CTCF. Our results establish that, because of irregular motif structures, variable spacing, and dependent recognition between sub-motifs, the specificities of long ZFPs are significantly underestimated. We therefore developed an algorithm, ModeMap, to infer the motifs and recognition models of ZIM3 and ZNF343, which facilitates high-confidence identification of specific binding sites, including repeat-derived elements. With revised concepts, techniques, and algorithms, we can discover the overlooked specificities and functions of those ‘extra’ fingers, and thereby decipher their broader roles in human biology and disease.
format Online
Article
Text
id pubmed-10287918
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10287918 2023-06-24 On the dependent recognition of some long zinc finger proteins Zuo, Zheng Billings, Timothy Walker, Michael Petkov, Petko M Fordyce, Polly M Stormo, Gary D Nucleic Acids Res Gene regulation, Chromatin and Epigenetics Oxford University Press 2023-03-23 /pmc/articles/PMC10287918/ /pubmed/36951113 http://dx.doi.org/10.1093/nar/gkad207 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title On the dependent recognition of some long zinc finger proteins
topic Gene regulation, Chromatin and Epigenetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10287918/
https://www.ncbi.nlm.nih.gov/pubmed/36951113
http://dx.doi.org/10.1093/nar/gkad207