
Prediction of CTCF loop anchor based on machine learning

Introduction: Various activities in biological cells are affected by three-dimensional genome structure. Insulators play an important role in the organization of higher-order structure. CTCF is a representative mammalian insulator, which can produce barriers that prevent the continuous extrusi...


Bibliographic Details
Main authors: Zhang, Xiao, Zhu, Wen, Sun, Huimin, Ding, Yijie, Liu, Li
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10106609/
https://www.ncbi.nlm.nih.gov/pubmed/37077544
http://dx.doi.org/10.3389/fgene.2023.1181956
_version_ 1785026440068071424
author Zhang, Xiao
Zhu, Wen
Sun, Huimin
Ding, Yijie
Liu, Li
author_facet Zhang, Xiao
Zhu, Wen
Sun, Huimin
Ding, Yijie
Liu, Li
author_sort Zhang, Xiao
collection PubMed
description Introduction: Various activities in biological cells are affected by three-dimensional genome structure. Insulators play an important role in the organization of higher-order structure. CTCF is a representative mammalian insulator, which can produce barriers that prevent the continuous extrusion of chromatin loops. As a multifunctional protein, CTCF has tens of thousands of binding sites in the genome, but only a portion of them can be used as anchors of chromatin loops. It is still unclear how cells select the anchors in the process of chromatin looping. Methods: In this paper, a comparative analysis is performed to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. Furthermore, a machine learning model based on CTCF binding intensity and DNA sequence is proposed to predict which CTCF sites can form chromatin loop anchors. Results: The machine learning model that we constructed for predicting CTCF-mediated chromatin loop anchors reached an accuracy of 0.8646. We also find that the formation of loop anchors is mainly influenced by the CTCF binding strength and binding pattern (which can be interpreted as the binding of different zinc fingers). Discussion: In conclusion, our results suggest that the CTCF core motif and its flanking sequence may be responsible for the binding specificity. This work contributes to understanding the mechanism of loop anchor selection and provides a reference for the prediction of CTCF-mediated chromatin loops.
format Online
Article
Text
id pubmed-10106609
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10106609 2023-04-18 Prediction of CTCF loop anchor based on machine learning Zhang, Xiao Zhu, Wen Sun, Huimin Ding, Yijie Liu, Li Front Genet Genetics Introduction: Various activities in biological cells are affected by three-dimensional genome structure. Insulators play an important role in the organization of higher-order structure. CTCF is a representative mammalian insulator, which can produce barriers that prevent the continuous extrusion of chromatin loops. As a multifunctional protein, CTCF has tens of thousands of binding sites in the genome, but only a portion of them can be used as anchors of chromatin loops. It is still unclear how cells select the anchors in the process of chromatin looping. Methods: In this paper, a comparative analysis is performed to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. Furthermore, a machine learning model based on CTCF binding intensity and DNA sequence is proposed to predict which CTCF sites can form chromatin loop anchors. Results: The machine learning model that we constructed for predicting CTCF-mediated chromatin loop anchors reached an accuracy of 0.8646. We also find that the formation of loop anchors is mainly influenced by the CTCF binding strength and binding pattern (which can be interpreted as the binding of different zinc fingers). Discussion: In conclusion, our results suggest that the CTCF core motif and its flanking sequence may be responsible for the binding specificity. This work contributes to understanding the mechanism of loop anchor selection and provides a reference for the prediction of CTCF-mediated chromatin loops. Frontiers Media S.A. 2023-04-03 /pmc/articles/PMC10106609/ /pubmed/37077544 http://dx.doi.org/10.3389/fgene.2023.1181956 Text en Copyright © 2023 Zhang, Zhu, Sun, Ding and Liu.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Zhang, Xiao
Zhu, Wen
Sun, Huimin
Ding, Yijie
Liu, Li
Prediction of CTCF loop anchor based on machine learning
title Prediction of CTCF loop anchor based on machine learning
title_full Prediction of CTCF loop anchor based on machine learning
title_fullStr Prediction of CTCF loop anchor based on machine learning
title_full_unstemmed Prediction of CTCF loop anchor based on machine learning
title_short Prediction of CTCF loop anchor based on machine learning
title_sort prediction of ctcf loop anchor based on machine learning
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10106609/
https://www.ncbi.nlm.nih.gov/pubmed/37077544
http://dx.doi.org/10.3389/fgene.2023.1181956
work_keys_str_mv AT zhangxiao predictionofctcfloopanchorbasedonmachinelearning
AT zhuwen predictionofctcfloopanchorbasedonmachinelearning
AT sunhuimin predictionofctcfloopanchorbasedonmachinelearning
AT dingyijie predictionofctcfloopanchorbasedonmachinelearning
AT liuli predictionofctcfloopanchorbasedonmachinelearning