
Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF

CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors to bind and recruit CTCF to the chromatin. However, it remains unclear whether specifi...


Bibliographic Details
Main Authors: Kuang, Shuzhen, Wang, Liangjiang
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671415/
https://www.ncbi.nlm.nih.gov/pubmed/33575587
http://dx.doi.org/10.1093/nargab/lqaa031
author Kuang, Shuzhen
Wang, Liangjiang
collection PubMed
description CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors to bind and recruit CTCF to the chromatin. However, it remains unclear whether specific sequence patterns are shared by the CTCF-binding RNA sites, and no RNA motif has been reported so far for CTCF binding. In this study, we have developed DeepLncCTCF, a new deep learning model based on a convolutional neural network and a bidirectional long short-term memory network, to discover the RNA recognition patterns of CTCF and identify candidate lncRNAs binding to CTCF. When evaluated on two different datasets, a human U2OS dataset and a mouse ESC dataset, DeepLncCTCF accurately predicted CTCF-binding RNA sites from nucleotide sequence. By examining the sequence features learned by DeepLncCTCF, we discovered a novel RNA motif with the consensus sequence AGAUNGGA for potential CTCF binding in humans. Furthermore, the applicability of DeepLncCTCF was demonstrated by identifying nearly 5000 candidate lncRNAs that might bind to CTCF in the nucleus. Our results provide useful information for understanding the molecular mechanisms of CTCF function in 3D genome organization.
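The consensus sequence reported in the description, AGAUNGGA, can be illustrated with a simple pattern scan. The sketch below is not DeepLncCTCF and is not taken from the article: it is plain string matching that assumes the standard IUPAC reading of N as any nucleotide. The function names `consensus_to_regex` and `find_motif_sites` are hypothetical, chosen for illustration. A match only flags a candidate site; it does not by itself indicate CTCF binding.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes (RNA alphabet).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "S": "[CG]", "W": "[AU]",
    "K": "[GU]", "M": "[AC]", "B": "[CGU]", "D": "[AGU]",
    "H": "[ACU]", "V": "[ACG]", "N": "[ACGU]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regex (hypothetical helper)."""
    body = "".join(IUPAC_RNA[base] for base in consensus.upper())
    # Wrap in a lookahead so that overlapping occurrences are all reported.
    return re.compile("(?=(" + body + "))")


def find_motif_sites(sequence, consensus="AGAUNGGA"):
    """Return (0-based start, matched site) pairs for every motif occurrence."""
    # Accept DNA-style input by converting T to U before scanning.
    rna = sequence.upper().replace("T", "U")
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    # Toy sequence made up for illustration, not data from the study.
    for start, site in find_motif_sites("CCAGATCGGATT"):
        print(start, site)
```

An exact consensus scan is a much cruder filter than a trained model, since it ignores flanking sequence context and any position-specific base preferences. It is shown here only to make the meaning of the N position concrete.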
format Online
Article
Text
id pubmed-7671415
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7671415 2021-02-10 Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF Kuang, Shuzhen Wang, Liangjiang NAR Genom Bioinform Standard Article Oxford University Press 2020-05-06 /pmc/articles/PMC7671415/ /pubmed/33575587 http://dx.doi.org/10.1093/nargab/lqaa031 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671415/
https://www.ncbi.nlm.nih.gov/pubmed/33575587
http://dx.doi.org/10.1093/nargab/lqaa031