Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF
CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors to bind and recruit CTCF to the chromatin. However, it remains unclear whether specifi...
Main authors: | Kuang, Shuzhen; Wang, Liangjiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671415/ https://www.ncbi.nlm.nih.gov/pubmed/33575587 http://dx.doi.org/10.1093/nargab/lqaa031 |
_version_ | 1783610924456738816 |
---|---|
author | Kuang, Shuzhen Wang, Liangjiang |
author_facet | Kuang, Shuzhen Wang, Liangjiang |
author_sort | Kuang, Shuzhen |
collection | PubMed |
description | CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors to bind and recruit CTCF to the chromatin. However, it remains unclear whether specific sequence patterns are shared by the CTCF-binding RNA sites, and no RNA motif has been reported so far for CTCF binding. In this study, we have developed DeepLncCTCF, a new deep learning model based on a convolutional neural network and a bidirectional long short-term memory network, to discover the RNA recognition patterns of CTCF and identify candidate lncRNAs binding to CTCF. When evaluated on two different datasets, human U2OS dataset and mouse ESC dataset, DeepLncCTCF was shown to be able to accurately predict CTCF-binding RNA sites from nucleotide sequence. By examining the sequence features learned by DeepLncCTCF, we discovered a novel RNA motif with the consensus sequence, AGAUNGGA, for potential CTCF binding in humans. Furthermore, the applicability of DeepLncCTCF was demonstrated by identifying nearly 5000 candidate lncRNAs that might bind to CTCF in the nucleus. Our results provide useful information for understanding the molecular mechanisms of CTCF function in 3D genome organization. |
format | Online Article Text |
id | pubmed-7671415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7671415 2021-02-10 Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF Kuang, Shuzhen Wang, Liangjiang NAR Genom Bioinform Standard Article CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors to bind and recruit CTCF to the chromatin. However, it remains unclear whether specific sequence patterns are shared by the CTCF-binding RNA sites, and no RNA motif has been reported so far for CTCF binding. In this study, we have developed DeepLncCTCF, a new deep learning model based on a convolutional neural network and a bidirectional long short-term memory network, to discover the RNA recognition patterns of CTCF and identify candidate lncRNAs binding to CTCF. When evaluated on two different datasets, human U2OS dataset and mouse ESC dataset, DeepLncCTCF was shown to be able to accurately predict CTCF-binding RNA sites from nucleotide sequence. By examining the sequence features learned by DeepLncCTCF, we discovered a novel RNA motif with the consensus sequence, AGAUNGGA, for potential CTCF binding in humans. Furthermore, the applicability of DeepLncCTCF was demonstrated by identifying nearly 5000 candidate lncRNAs that might bind to CTCF in the nucleus. Our results provide useful information for understanding the molecular mechanisms of CTCF function in 3D genome organization. Oxford University Press 2020-05-06 /pmc/articles/PMC7671415/ /pubmed/33575587 http://dx.doi.org/10.1093/nargab/lqaa031 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Standard Article Kuang, Shuzhen Wang, Liangjiang Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF |
title | Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF |
title_full | Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF |
title_fullStr | Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF |
title_full_unstemmed | Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF |
title_short | Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF |
title_sort | identification and analysis of consensus rna motifs binding to the genome regulator ctcf |
topic | Standard Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671415/ https://www.ncbi.nlm.nih.gov/pubmed/33575587 http://dx.doi.org/10.1093/nargab/lqaa031 |
work_keys_str_mv | AT kuangshuzhen identificationandanalysisofconsensusrnamotifsbindingtothegenomeregulatorctcf AT wangliangjiang identificationandanalysisofconsensusrnamotifsbindingtothegenomeregulatorctcf |