BoT-Net: a lightweight bag of tricks-based neural network for efficient LncRNA–miRNA interaction prediction
BACKGROUND AND OBJECTIVE: Interactions of long non-coding ribonucleic acids (lncRNAs) with micro-ribonucleic acids (miRNAs) play an essential role in gene regulation, cellular metabolic, and pathological processes. Existing purely sequence based computational approaches lack robustness and efficienc...
Main authors: | Asim, Muhammad Nabeel; Ibrahim, Muhammad Ali; Zehe, Christoph; Trygg, Johan; Dengel, Andreas; Ahmed, Sheraz |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Nature Singapore, 2022 |
Subjects: | Original Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581873/ https://www.ncbi.nlm.nih.gov/pubmed/35947255 http://dx.doi.org/10.1007/s12539-022-00535-x |
_version_ | 1784812725526855680 |
---|---|
author | Asim, Muhammad Nabeel; Ibrahim, Muhammad Ali; Zehe, Christoph; Trygg, Johan; Dengel, Andreas; Ahmed, Sheraz |
author_sort | Asim, Muhammad Nabeel |
collection | PubMed |
description | BACKGROUND AND OBJECTIVE: Interactions of long non-coding ribonucleic acids (lncRNAs) with micro-ribonucleic acids (miRNAs) play an essential role in gene regulation and in cellular metabolic and pathological processes. Existing purely sequence-based computational approaches lack robustness and efficiency, mainly due to the high length variability of lncRNA sequences. Hence, the prime focus of the current study is to find optimal length trade-offs for lncRNA sequences of highly variable length. METHOD: The paper performs an in-depth exploration of diverse copy padding and sequence truncation approaches, and presents the novel idea of using only subregions of lncRNA sequences to generate fixed-length lncRNA sequences. Furthermore, it presents a novel bag of tricks-based deep learning approach, “BoT-Net”, which leverages a single-layer long short-term memory network regularized through DropConnect to capture higher-order residue dependencies, pooling to retain the most salient features, normalization to prevent exploding and vanishing gradient issues, learning rate decay, and dropout to regularize a precise neural network for lncRNA–miRNA interaction prediction. RESULTS: BoT-Net outperforms the state-of-the-art lncRNA–miRNA interaction prediction approach by 2%, 8%, and 4% in terms of accuracy, specificity, and Matthews correlation coefficient, respectively. Furthermore, a case study analysis indicates that BoT-Net also outperforms the state-of-the-art lncRNA–protein interaction predictor on a benchmark dataset by 10% in accuracy, 19% in sensitivity, 6% in specificity, 14% in precision, and 26% in Matthews correlation coefficient. CONCLUSION: In the benchmark lncRNA–miRNA interaction prediction dataset, lncRNA sequence length varies from 213 to 22,743 residues, and in the benchmark lncRNA–protein interaction prediction dataset, it varies from 15 to 1504 residues. For sequences of such highly variable length, fixed-length generation using copy padding introduces a significant level of bias, which makes a large number of lncRNA sequences nearly identical to each other and eventually derails classifier generalizability. Empirical evaluation reveals that the first 50 residues of long lncRNA sequences alone contain a highly informative distribution for lncRNA–miRNA interaction prediction, a crucial finding exploited by the proposed BoT-Net approach to optimize the lncRNA fixed-length generation process. AVAILABILITY: The BoT-Net web server can be accessed at https://sds_genetic_analysis.opendfki.de/lncmiRNA/. GRAPHIC ABSTRACT: [Image: see text] |
format | Online Article Text |
id | pubmed-9581873 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Nature Singapore |
record_format | MEDLINE/PubMed |
spelling | pubmed-9581873 2022-10-21 BoT-Net: a lightweight bag of tricks-based neural network for efficient LncRNA–miRNA interaction prediction Asim, Muhammad Nabeel; Ibrahim, Muhammad Ali; Zehe, Christoph; Trygg, Johan; Dengel, Andreas; Ahmed, Sheraz Interdiscip Sci Original Research Article Springer Nature Singapore 2022-08-10 2022 /pmc/articles/PMC9581873/ /pubmed/35947255 http://dx.doi.org/10.1007/s12539-022-00535-x Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | BoT-Net: a lightweight bag of tricks-based neural network for efficient LncRNA–miRNA interaction prediction |
topic | Original Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9581873/ https://www.ncbi.nlm.nih.gov/pubmed/35947255 http://dx.doi.org/10.1007/s12539-022-00535-x |
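
The description in this record mentions three ways of turning variable-length lncRNA sequences into fixed-length ones: copy padding, truncation, and keeping only a subregion such as the first 50 residues. The Python sketch below illustrates one plausible reading of each. It is not the authors' code: the record does not specify which padding or truncation variants the paper evaluates, so cyclic repetition as the form of copy padding, the function names `copy_pad`, `truncate`, and `fixed_length`, and the `keep` parameter are all assumptions made for illustration. Only the default length of 50 residues is taken from the abstract.

```python
def copy_pad(seq: str, target_len: int) -> str:
    """Extend a short sequence by repeating it cyclically up to target_len.

    Assumption: cyclic repetition is only one possible form of copy padding.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    repeats = -(-target_len // len(seq))  # ceiling division
    return (seq * repeats)[:target_len]


def truncate(seq: str, target_len: int, keep: str = "start") -> str:
    """Keep target_len residues from the start or the end of the sequence."""
    if keep == "start":
        return seq[:target_len]
    if keep == "end":
        return seq[-target_len:]
    raise ValueError("keep must be 'start' or 'end'")


def fixed_length(seq: str, target_len: int = 50, keep: str = "start") -> str:
    """Return a sequence of exactly target_len residues.

    Long sequences are cut down to a subregion; short ones are copy padded.
    The default of 50 residues from the start follows the abstract's finding.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if len(seq) >= target_len:
        return truncate(seq, target_len, keep)
    return copy_pad(seq, target_len)


if __name__ == "__main__":
    long_seq = "ACGU" * 100   # hypothetical 400-residue sequence
    short_seq = "ACG"         # hypothetical 3-residue sequence
    print(fixed_length(long_seq))       # first 50 residues
    print(fixed_length(short_seq, 7))   # copy padded to 7 residues
```

The sketch also makes the bias noted in the description easy to see: when `target_len` is far larger than a sequence, most of the padded output is repeated content, so many short sequences end up looking alike.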