
A Hybrid Prediction Method for Plant lncRNA-Protein Interaction


Bibliographic Details
Main Authors: Wekesa, Jael Sanyanda, Luan, Yushi, Chen, Ming, Meng, Jun
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6627874/
https://www.ncbi.nlm.nih.gov/pubmed/31151273
http://dx.doi.org/10.3390/cells8060521
author Wekesa, Jael Sanyanda
Luan, Yushi
Chen, Ming
Meng, Jun
collection PubMed
description Identification and analysis of long non-protein-coding RNAs (lncRNAs) are pervasive in transcriptome studies because of their roles in biological processes. In particular, lncRNA-protein interactions are plausibly relevant to the regulation of gene expression and to cellular processes such as pathogen resistance in plants. While lncRNA-protein interaction has been studied in animals, it has yet to be studied extensively in plants. In this paper, we propose a novel plant lncRNA-protein interaction prediction method, PLRPIM, which combines deep learning and shallow machine learning methods. Selecting an optimal feature subset and then compressing it efficiently are significant challenges for deep learning models. The proposed method adopts k-mer features and extracts high-level abstract sequence-based features using a stacked sparse autoencoder. Based on the extracted features, a fusion of random forest (RF) and light gradient boosting machine (LGBM) is used to build the prediction model. Performance is evaluated on Arabidopsis thaliana and Zea mays datasets. Experimental results show that PLRPIM outperforms other prediction tools on both datasets. Based on 5-fold cross-validation, we obtain accuracies of 89.98% and 93.44% and AUCs of 0.954 and 0.982 for Arabidopsis thaliana and Zea mays, respectively. PLRPIM predicts potential lncRNA-protein interaction pairs effectively, which can facilitate lncRNA-related research, including function prediction.
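The description mentions k-mer features and a fusion of two classifiers. As a rough illustration only, the sketch below computes normalized k-mer frequencies for a sequence and averages two models' predicted probabilities. The function names, the default `k=3`, the `ACGU` alphabet, and the equal-weight averaging rule are all assumptions made for this example; the record does not state the settings or the fusion rule used in PLRPIM.

```python
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Hypothetical helper: k and alphabet are illustrative defaults,
    not values taken from the PLRPIM paper.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    total = 0
    # Slide a window of width k over the sequence.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing symbols outside the alphabet
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def fuse_probabilities(p_a, p_b, weight=0.5):
    """Weighted average of two classifiers' interaction probabilities.

    Assumed fusion rule for illustration; the actual RF/LGBM fusion
    in the paper may differ.
    """
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_a, p_b)]


if __name__ == "__main__":
    features = kmer_frequencies("ACGUACGU", k=2)
    print(len(features))  # 4**2 = 16 possible 2-mers
    print(fuse_probabilities([0.8, 0.2], [0.6, 0.4]))
```

In a fuller pipeline, feature vectors like these would be compressed by an autoencoder before classification; that step is omitted here because it would need a deep learning library and settings the record does not give.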
format Online
Article
Text
id pubmed-6627874
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6627874 2019-07-23
A Hybrid Prediction Method for Plant lncRNA-Protein Interaction
Wekesa, Jael Sanyanda
Luan, Yushi
Chen, Ming
Meng, Jun
Cells
Article
MDPI 2019-05-30
/pmc/articles/PMC6627874/
/pubmed/31151273
http://dx.doi.org/10.3390/cells8060521
Text en
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title A Hybrid Prediction Method for Plant lncRNA-Protein Interaction
topic Article