Prediction of CTCF loop anchor based on machine learning
Introduction: Various activities in biological cells are affected by three-dimensional genome structure. The insulators play an important role in the organization of higher-order structure. CTCF is a representative of mammalian insulators, which can produce barriers to prevent the continuous extrusi...
Main Authors: | Zhang, Xiao, Zhu, Wen, Sun, Huimin, Ding, Yijie, Liu, Li |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10106609/ https://www.ncbi.nlm.nih.gov/pubmed/37077544 http://dx.doi.org/10.3389/fgene.2023.1181956 |
_version_ | 1785026440068071424 |
---|---|
author | Zhang, Xiao Zhu, Wen Sun, Huimin Ding, Yijie Liu, Li |
author_facet | Zhang, Xiao Zhu, Wen Sun, Huimin Ding, Yijie Liu, Li |
author_sort | Zhang, Xiao |
collection | PubMed |
description | Introduction: Various activities in biological cells are affected by three-dimensional genome structure. The insulators play an important role in the organization of higher-order structure. CTCF is a representative of mammalian insulators, which can produce barriers to prevent the continuous extrusion of chromatin loops. As a multifunctional protein, CTCF has tens of thousands of binding sites in the genome, but only a portion of them can be used as anchors of chromatin loops. It is still unclear how cells select the anchors in the process of chromatin looping. Methods: In this paper, a comparative analysis is performed to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. Furthermore, a machine learning model based on the CTCF binding intensity and DNA sequence is proposed to predict which CTCF sites can form chromatin loop anchors. Results: The machine learning model that we constructed for predicting the anchors of CTCF-mediated chromatin loops reached an accuracy of 0.8646. We also find that the formation of loop anchors is mainly influenced by the CTCF binding strength and binding pattern (which can be interpreted as the binding of different zinc fingers). Discussion: In conclusion, our results suggest that the CTCF core motif and its flanking sequence may be responsible for the binding specificity. This work contributes to understanding the mechanism of loop anchor selection and provides a reference for the prediction of CTCF-mediated chromatin loops. |
format | Online Article Text |
id | pubmed-10106609 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10106609 2023-04-18 Prediction of CTCF loop anchor based on machine learning Zhang, Xiao Zhu, Wen Sun, Huimin Ding, Yijie Liu, Li Front Genet Genetics Introduction: Various activities in biological cells are affected by three-dimensional genome structure. The insulators play an important role in the organization of higher-order structure. CTCF is a representative of mammalian insulators, which can produce barriers to prevent the continuous extrusion of chromatin loops. As a multifunctional protein, CTCF has tens of thousands of binding sites in the genome, but only a portion of them can be used as anchors of chromatin loops. It is still unclear how cells select the anchors in the process of chromatin looping. Methods: In this paper, a comparative analysis is performed to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. Furthermore, a machine learning model based on the CTCF binding intensity and DNA sequence is proposed to predict which CTCF sites can form chromatin loop anchors. Results: The machine learning model that we constructed for predicting the anchors of CTCF-mediated chromatin loops reached an accuracy of 0.8646. We also find that the formation of loop anchors is mainly influenced by the CTCF binding strength and binding pattern (which can be interpreted as the binding of different zinc fingers). Discussion: In conclusion, our results suggest that the CTCF core motif and its flanking sequence may be responsible for the binding specificity. This work contributes to understanding the mechanism of loop anchor selection and provides a reference for the prediction of CTCF-mediated chromatin loops. Frontiers Media S.A. 2023-04-03 /pmc/articles/PMC10106609/ /pubmed/37077544 http://dx.doi.org/10.3389/fgene.2023.1181956 Text en Copyright © 2023 Zhang, Zhu, Sun, Ding and Liu.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Zhang, Xiao Zhu, Wen Sun, Huimin Ding, Yijie Liu, Li Prediction of CTCF loop anchor based on machine learning |
title | Prediction of CTCF loop anchor based on machine learning |
title_full | Prediction of CTCF loop anchor based on machine learning |
title_fullStr | Prediction of CTCF loop anchor based on machine learning |
title_full_unstemmed | Prediction of CTCF loop anchor based on machine learning |
title_short | Prediction of CTCF loop anchor based on machine learning |
title_sort | prediction of ctcf loop anchor based on machine learning |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10106609/ https://www.ncbi.nlm.nih.gov/pubmed/37077544 http://dx.doi.org/10.3389/fgene.2023.1181956 |
work_keys_str_mv | AT zhangxiao predictionofctcfloopanchorbasedonmachinelearning AT zhuwen predictionofctcfloopanchorbasedonmachinelearning AT sunhuimin predictionofctcfloopanchorbasedonmachinelearning AT dingyijie predictionofctcfloopanchorbasedonmachinelearning AT liuli predictionofctcfloopanchorbasedonmachinelearning |
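
The description field above mentions a machine learning model that combines CTCF binding intensity with DNA sequence to predict loop anchors. The record does not say which features or classifier the authors used, so the following is only a minimal sketch under stated assumptions: sequence is encoded as 2-mer frequencies, binding intensity is a single number, and a plain logistic regression stands in for the unspecified model. All names (`featurize`, `train_logistic`, `demo`) and the toy sites are hypothetical and are not taken from the article.

```python
import math
from itertools import product

K = 2  # k-mer length; an assumption for illustration, not from the article
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def featurize(sequence, binding_intensity):
    """Encode a site as k-mer frequencies plus its binding intensity."""
    seq = sequence.upper()
    n_windows = max(len(seq) - K + 1, 1)
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if kmer in counts:  # windows with N or other symbols are skipped
            counts[kmer] += 1
    return [counts[k] / n_windows for k in KMERS] + [float(binding_intensity)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = sigmoid(z) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (anchor) or 0 (non-anchor) for one feature vector."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def demo():
    """Train and score on made-up toy sites (not real CTCF data)."""
    sites = [  # (sequence, binding intensity, is_anchor); all invented
        ("CCGCGAGGTGGCAG", 0.9, 1),
        ("CCACCAGGGGGCGC", 0.8, 1),
        ("ATTTAAGGTATCAT", 0.2, 0),
        ("TTACTAGATGTAAA", 0.1, 0),
    ]
    X = [featurize(seq, intensity) for seq, intensity, _ in sites]
    y = [label for _, _, label in sites]
    w, b = train_logistic(X, y)
    return accuracy(y, [predict(w, b, x) for x in X])
```

Calling `demo()` returns the training accuracy on the four invented sites. That value says nothing about the 0.8646 reported in the record, which would require the authors' data, features, and evaluation protocol.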