EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction

BACKGROUND: Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computati...

Bibliographic Details
Main authors: Wang, Jingjing, Zhao, Yanpeng, Gong, Weikang, Liu, Yang, Wang, Mei, Huang, Xiaoqian, Tan, Jianjun
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980572/
https://www.ncbi.nlm.nih.gov/pubmed/33740884
http://dx.doi.org/10.1186/s12859-021-04069-9
_version_ 1783667455052218368
author Wang, Jingjing
Zhao, Yanpeng
Gong, Weikang
Liu, Yang
Wang, Mei
Huang, Xiaoqian
Tan, Jianjun
author_facet Wang, Jingjing
Zhao, Yanpeng
Gong, Weikang
Liu, Yang
Wang, Mei
Huang, Xiaoqian
Tan, Jianjun
author_sort Wang, Jingjing
collection PubMed
description BACKGROUND: Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computational methods to accurately and efficiently predict ncRNA–protein interactions. RESULTS: In this work, we presented an ensemble deep learning-based method, EDLMFC, to predict ncRNA–protein interactions using the combination of multi-scale features, including primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer was used to extract protein/ncRNA sequence features, which were integrated with tertiary structure features and then fed into an ensemble deep learning model that combined a convolutional neural network (CNN), to learn dominating biological information, with a bi-directional long short-term memory network (BLSTM), to capture long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance with accuracy of 93.8%, 89.7%, and 86.1% on RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The results of the independent test demonstrated that EDLMFC can effectively predict potential ncRNA–protein interactions from different organisms. Furthermore, EDLMFC is also shown to successfully predict hub ncRNAs and proteins present in ncRNA–protein networks of Mus musculus. CONCLUSIONS: In general, our proposed method EDLMFC improved the accuracy of ncRNA–protein interaction predictions and is expected to provide helpful guidance for research on ncRNA functions. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04069-9.
format Online
Article
Text
id pubmed-7980572
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7980572 2021-03-22 EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction Wang, Jingjing Zhao, Yanpeng Gong, Weikang Liu, Yang Wang, Mei Huang, Xiaoqian Tan, Jianjun BMC Bioinformatics Research Article BACKGROUND: Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computational methods to accurately and efficiently predict ncRNA–protein interactions. RESULTS: In this work, we presented an ensemble deep learning-based method, EDLMFC, to predict ncRNA–protein interactions using the combination of multi-scale features, including primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer was used to extract protein/ncRNA sequence features, which were integrated with tertiary structure features and then fed into an ensemble deep learning model that combined a convolutional neural network (CNN), to learn dominating biological information, with a bi-directional long short-term memory network (BLSTM), to capture long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance with accuracy of 93.8%, 89.7%, and 86.1% on RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The results of the independent test demonstrated that EDLMFC can effectively predict potential ncRNA–protein interactions from different organisms. Furthermore, EDLMFC is also shown to successfully predict hub ncRNAs and proteins present in ncRNA–protein networks of Mus musculus.
CONCLUSIONS: In general, our proposed method EDLMFC improved the accuracy of ncRNA–protein interaction predictions and is expected to provide helpful guidance for research on ncRNA functions. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04069-9. BioMed Central 2021-03-19 /pmc/articles/PMC7980572/ /pubmed/33740884 http://dx.doi.org/10.1186/s12859-021-04069-9 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Wang, Jingjing
Zhao, Yanpeng
Gong, Weikang
Liu, Yang
Wang, Mei
Huang, Xiaoqian
Tan, Jianjun
EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
title EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
title_full EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
title_fullStr EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
title_full_unstemmed EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
title_short EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA–protein interaction prediction
title_sort edlmfc: an ensemble deep learning framework with multi-scale features combination for ncrna–protein interaction prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980572/
https://www.ncbi.nlm.nih.gov/pubmed/33740884
http://dx.doi.org/10.1186/s12859-021-04069-9
work_keys_str_mv AT wangjingjing edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
AT zhaoyanpeng edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
AT gongweikang edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
AT liuyang edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
AT wangmei edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
AT huangxiaoqian edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
AT tanjianjun edlmfcanensembledeeplearningframeworkwithmultiscalefeaturescombinationforncrnaproteininteractionprediction
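
Note on the description field: the abstract says that k-mer based features are extracted from ncRNA and protein sequences before being passed to the deep learning model. The record does not give the exact encoding, so the following is only a minimal sketch of plain k-mer frequency features for an RNA sequence, concatenated over several values of k. The function names (`kmer_frequencies`, `multi_k_features`), the `ACGU` alphabet, and the normalization are assumptions made for illustration and are not taken from the EDLMFC source code.

```python
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGU"):
    """Return normalized counts of every possible k-mer over `alphabet`.

    Hypothetical helper: the encoding used by the paper may differ.
    """
    # Enumerate all len(alphabet)**k possible k-mers in a fixed order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    # Slide a window of width k over the sequence.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing unknown characters
            counts[kmer] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(kmers)
    return [counts[km] / windows for km in kmers]


def multi_k_features(seq, max_k=3, alphabet="ACGU"):
    """Concatenate k-mer frequency vectors for k = 1..max_k (assumed scheme)."""
    features = []
    for k in range(1, max_k + 1):
        features.extend(kmer_frequencies(seq, k, alphabet))
    return features


vec = multi_k_features("AUGGCUAC", max_k=2)
print(len(vec))  # 4 one-mers + 16 two-mers = 20
```

A fixed-length vector like this can be fed to any downstream classifier regardless of sequence length, which is presumably why frequency-style encodings are common for this kind of task.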