
A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine

Bibliographic Details
Main authors: Jain, Dharm Skandh; Gupte, Sanket Rajan; Aduri, Raviprasad
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6015049/
https://www.ncbi.nlm.nih.gov/pubmed/29934510
http://dx.doi.org/10.1038/s41598-018-27814-2
_version_ 1783334316070141952
author Jain, Dharm Skandh
Gupte, Sanket Rajan
Aduri, Raviprasad
collection PubMed
description RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.
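The descriptor mentioned in the abstract can be illustrated with a minimal sketch. It assumes one possible reading of the "minimum structural unit consisting of five residues": a sliding window of five consecutive residues within a single sequence, counted over a reduced alphabet. The abstract does not specify this, and the amino acid grouping below is a hypothetical placeholder based on common physicochemical classes, not the structure-derived classification used by the authors. All names (`AA_GROUPS`, `NT_MAP`, `window_features`) are illustrative.

```python
from collections import Counter
from itertools import product

# HYPOTHETICAL amino acid classes (rough physicochemical grouping).
# The paper derives its classes from structural data; those are not given here.
AA_GROUPS = {
    "a": "AGVLIMP",  # aliphatic / small
    "b": "FWY",      # aromatic
    "c": "STNQC",    # polar, uncharged
    "d": "KRH",      # positively charged
    "e": "DE",       # negatively charged
}
AA_MAP = {res: label for label, members in AA_GROUPS.items() for res in members}

# ASSUMPTION: each nucleotide is treated as its own class.
NT_MAP = {nt: nt for nt in "ACGU"}


def window_features(seq, mapping, k=5):
    """Return normalized frequencies of every k-residue window over the class alphabet."""
    # Symbols missing from the mapping are dropped, which joins their neighbours.
    reduced = "".join(mapping[r] for r in seq.upper() if r in mapping)
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    total = sum(counts.values())
    classes = sorted(set(mapping.values()))
    keys = ["".join(p) for p in product(classes, repeat=k)]  # fixed feature order
    return [counts[key] / total if total else 0.0 for key in keys]


# Toy sequences, for illustration only.
protein_vec = window_features("MKRRGDEFW", AA_MAP)
rna_vec = window_features("UUAGGGUUAGGG", NT_MAP)
pair_vec = protein_vec + rna_vec  # one feature vector per RNA-protein pair
```

A vector such as `pair_vec`, labeled as interacting or non-interacting, could then be passed to any gradient boosting classifier; that training step is omitted here because the record gives no details about it.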
format Online
Article
Text
id pubmed-6015049
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6015049 2018-07-06 A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine Jain, Dharm Skandh Gupte, Sanket Rajan Aduri, Raviprasad Sci Rep Article Nature Publishing Group UK 2018-06-22 /pmc/articles/PMC6015049/ /pubmed/29934510 http://dx.doi.org/10.1038/s41598-018-27814-2 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine
topic Article