CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization

BACKGROUND: Circular RNAs (circRNAs) play a significant role in some diseases by acting as transcription templates. Analyzing the interaction mechanism between circRNAs and RNA-binding proteins (RBPs) therefore has far-reaching implications for the prevention and treatment of disease. Existing models for circRNA-RBP site identification usually adopt convolutional neural networks (CNNs), recurrent neural networks (RNNs), or their variants as feature extractors; most suffer from poor parallelism, insufficient stability, and an inability to capture long-term dependencies.

METHODS: We propose a method that relies entirely on the self-attention mechanism to capture deep semantic features of RNA sequences, and on this basis construct the CircSSNN model for circRNA-RBP site identification. The model builds its feature scheme by fusing circRNA sequence representations with statistical distributions, static local contexts, and dynamic global contexts. With a stable and efficient network architecture, the distance between any two positions in a sequence is reduced to a constant, so CircSSNN can quickly capture long-term dependencies and extract deep semantic features.

RESULTS: Experiments on 37 circRNA datasets show that the proposed model has overall advantages in stability, parallelism, and prediction performance. Keeping the network structure and hyperparameters unchanged, we applied CircSSNN directly to linRNA datasets; the favorable results show that CircSSNN transfers simply and efficiently, without task-oriented tuning.

CONCLUSIONS: CircSSNN can serve as an appealing circRNA-RBP identification tool with good identification performance, excellent scalability, and wide application scope without task-oriented fine-tuning of parameters, which is expected to lower the professional threshold required for hyperparameter tuning in bioinformatics analysis.
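The feature scheme above fuses three kinds of sequence representation. As a purely illustrative example of the "statistical distributions" component, a common choice for RNA sequences is a normalized k-mer frequency vector; the k=3 setting and this exact formulation are assumptions for the sketch, not the paper's specification:

```python
from collections import Counter
from itertools import product

def kmer_frequency(seq, k=3):
    """Statistical-distribution features: the frequency of every k-mer
    over the 4-letter RNA alphabet, in fixed lexicographic order."""
    alphabet = "ACGU"
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)  # number of k-mer windows
    return [counts[w] / total for w in vocab]

feats = kmer_frequency("AUGGCUAGCUAGGCA", k=3)
print(len(feats))   # 64 = 4**3 features, one per possible 3-mer
print(sum(feats))   # close to 1.0: a proper frequency distribution
```

In a fusion scheme such a vector would typically be concatenated with learned local and global context embeddings before being fed to the network.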

Bibliographic Details
Main Authors: Cao, Chao, Yang, Shuhong, Li, Mengli, Li, Chungui
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10230723/
https://www.ncbi.nlm.nih.gov/pubmed/37254080
http://dx.doi.org/10.1186/s12859-023-05352-7
author Cao, Chao
Yang, Shuhong
Li, Mengli
Li, Chungui
author_facet Cao, Chao
Yang, Shuhong
Li, Mengli
Li, Chungui
author_sort Cao, Chao
collection PubMed
description BACKGROUND: Circular RNAs (circRNAs) play a significant role in some diseases by acting as transcription templates. Analyzing the interaction mechanism between circRNAs and RNA-binding proteins (RBPs) therefore has far-reaching implications for the prevention and treatment of disease. Existing models for circRNA-RBP site identification usually adopt convolutional neural networks (CNNs), recurrent neural networks (RNNs), or their variants as feature extractors; most suffer from poor parallelism, insufficient stability, and an inability to capture long-term dependencies. METHODS: In this paper, we propose a method that relies entirely on the self-attention mechanism to capture deep semantic features of RNA sequences, and on this basis construct the CircSSNN model for circRNA-RBP site identification. The model builds its feature scheme by fusing circRNA sequence representations with statistical distributions, static local contexts, and dynamic global contexts. With a stable and efficient network architecture, the distance between any two positions in a sequence is reduced to a constant, so CircSSNN can quickly capture long-term dependencies and extract deep semantic features. RESULTS: Experiments on 37 circRNA datasets show that the proposed model has overall advantages in stability, parallelism, and prediction performance. Keeping the network structure and hyperparameters unchanged, we applied CircSSNN directly to linRNA datasets; the favorable results show that CircSSNN transfers simply and efficiently without task-oriented tuning. CONCLUSIONS: CircSSNN can serve as an appealing circRNA-RBP identification tool with good identification performance, excellent scalability, and wide application scope without task-oriented fine-tuning of parameters, which is expected to lower the professional threshold required for hyperparameter tuning in bioinformatics analysis.
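The two architectural ideas the abstract leans on, self-attention (which gives a constant path length between any two sequence positions) and pre-normalization (LayerNorm applied before each sub-layer rather than after), can be sketched as follows. This is an illustrative single-head NumPy toy, not the authors' implementation; the weight shapes and initialization are assumptions:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each position's feature vector to zero mean, unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (L, d).
    Every position attends to every other in a single step, so the path
    length between any two positions is constant, independent of L."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (L, L) pairwise affinities
    return softmax(scores) @ v

def pre_norm_block(x, w_q, w_k, w_v):
    """Pre-normalization: LayerNorm is applied *before* the attention
    sub-layer, while the residual path carries x through unchanged."""
    return x + self_attention(layer_norm(x), w_q, w_k, w_v)

rng = np.random.default_rng(0)
L, d = 8, 16  # toy sequence length and embedding size
x = rng.normal(size=(L, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
y = pre_norm_block(x, w_q, w_k, w_v)
print(y.shape)  # (8, 16): same shape as the input, as in any residual block
```

Pre-normalization is generally credited with stabilizing training of deep attention stacks, which is consistent with the stability claims in the abstract.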
format Online
Article
Text
id pubmed-10230723
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10230723 2023-06-01 CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization Cao, Chao Yang, Shuhong Li, Mengli Li, Chungui BMC Bioinformatics Research BioMed Central 2023-05-30 /pmc/articles/PMC10230723/ /pubmed/37254080 http://dx.doi.org/10.1186/s12859-023-05352-7 Text en © The Author(s) 2023. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/); the Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Cao, Chao
Yang, Shuhong
Li, Mengli
Li, Chungui
CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization
title CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization
title_full CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization
title_fullStr CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization
title_full_unstemmed CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization
title_short CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization
title_sort circssnn: circrna-binding site prediction via sequence self-attention neural networks with pre-normalization
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10230723/
https://www.ncbi.nlm.nih.gov/pubmed/37254080
http://dx.doi.org/10.1186/s12859-023-05352-7
work_keys_str_mv AT caochao circssnncircrnabindingsitepredictionviasequenceselfattentionneuralnetworkswithprenormalization
AT yangshuhong circssnncircrnabindingsitepredictionviasequenceselfattentionneuralnetworkswithprenormalization
AT limengli circssnncircrnabindingsitepredictionviasequenceselfattentionneuralnetworkswithprenormalization
AT lichungui circssnncircrnabindingsitepredictionviasequenceselfattentionneuralnetworkswithprenormalization