EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
BACKGROUND: Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computati...
Main authors: | Wang, Jingjing; Zhao, Yanpeng; Gong, Weikang; Liu, Yang; Wang, Mei; Huang, Xiaoqian; Tan, Jianjun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2021
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980572/ https://www.ncbi.nlm.nih.gov/pubmed/33740884 http://dx.doi.org/10.1186/s12859-021-04069-9 |
_version_ | 1783667455052218368 |
---|---|
author | Wang, Jingjing Zhao, Yanpeng Gong, Weikang Liu, Yang Wang, Mei Huang, Xiaoqian Tan, Jianjun |
author_facet | Wang, Jingjing Zhao, Yanpeng Gong, Weikang Liu, Yang Wang, Mei Huang, Xiaoqian Tan, Jianjun |
author_sort | Wang, Jingjing |
collection | PubMed |
description | BACKGROUND: Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computational methods to accurately and efficiently predict ncRNA–protein interactions. RESULTS: In this work, we presented an ensemble deep learning-based method, EDLMFC, to predict ncRNA–protein interactions using the combination of multi-scale features, including primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer was used to extract protein/ncRNA sequence features, integrating tertiary structure features, then fed into an ensemble deep learning model, which combined convolutional neural network (CNN) to learn dominating biological information with bi-directional long short-term memory network (BLSTM) to capture long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance with accuracy of 93.8%, 89.7%, and 86.1% on RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The results of the independent test demonstrated that EDLMFC can effectively predict potential ncRNA–protein interactions from different organisms. Furtherly, EDLMFC is also shown to predict hub ncRNAs and proteins presented in ncRNA–protein networks of Mus musculus successfully. CONCLUSIONS: In general, our proposed method EDLMFC improved the accuracy of ncRNA–protein interaction predictions and anticipated providing some helpful guidance on ncRNA functions research. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04069-9. |
format | Online Article Text |
id | pubmed-7980572 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-79805722021-03-22 EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction Wang, Jingjing Zhao, Yanpeng Gong, Weikang Liu, Yang Wang, Mei Huang, Xiaoqian Tan, Jianjun BMC Bioinformatics Research Article BACKGROUND: Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computational methods to accurately and efficiently predict ncRNA–protein interactions. RESULTS: In this work, we presented an ensemble deep learning-based method, EDLMFC, to predict ncRNA–protein interactions using the combination of multi-scale features, including primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer was used to extract protein/ncRNA sequence features, integrating tertiary structure features, then fed into an ensemble deep learning model, which combined convolutional neural network (CNN) to learn dominating biological information with bi-directional long short-term memory network (BLSTM) to capture long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance with accuracy of 93.8%, 89.7%, and 86.1% on RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The results of the independent test demonstrated that EDLMFC can effectively predict potential ncRNA–protein interactions from different organisms. Furtherly, EDLMFC is also shown to predict hub ncRNAs and proteins presented in ncRNA–protein networks of Mus musculus successfully. 
CONCLUSIONS: In general, our proposed method EDLMFC improved the accuracy of ncRNA–protein interaction predictions and anticipated providing some helpful guidance on ncRNA functions research. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04069-9. BioMed Central 2021-03-19 /pmc/articles/PMC7980572/ /pubmed/33740884 http://dx.doi.org/10.1186/s12859-021-04069-9 Text en © The Author(s) 2021 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Wang, Jingjing Zhao, Yanpeng Gong, Weikang Liu, Yang Wang, Mei Huang, Xiaoqian Tan, Jianjun EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction |
title | EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction |
title_full | EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction |
title_fullStr | EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction |
title_full_unstemmed | EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction |
title_short | EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction |
title_sort | edlmfc: an ensemble deep learning framework with multi-scale features combination for ncrna–protein interaction prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980572/ https://www.ncbi.nlm.nih.gov/pubmed/33740884 http://dx.doi.org/10.1186/s12859-021-04069-9 |