A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine
RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the pr...
Main Authors: | Jain, Dharm Skandh; Gupte, Sanket Rajan; Aduri, Raviprasad |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6015049/ https://www.ncbi.nlm.nih.gov/pubmed/29934510 http://dx.doi.org/10.1038/s41598-018-27814-2 |
_version_ | 1783334316070141952 |
---|---|
author | Jain, Dharm Skandh; Gupte, Sanket Rajan; Aduri, Raviprasad |
author_sort | Jain, Dharm Skandh |
collection | PubMed |
description | RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA. |
format | Online Article Text |
id | pubmed-6015049 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6015049 2018-07-06 A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine Jain, Dharm Skandh Gupte, Sanket Rajan Aduri, Raviprasad Sci Rep Article RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA. Nature Publishing Group UK 2018-06-22 /pmc/articles/PMC6015049/ /pubmed/29934510 http://dx.doi.org/10.1038/s41598-018-27814-2 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6015049/ https://www.ncbi.nlm.nih.gov/pubmed/29934510 http://dx.doi.org/10.1038/s41598-018-27814-2 |
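
The description in this record mentions residues grouped into classes, a five-residue unit used as the descriptor, and a gradient boosting classifier. The sketch below is one possible reading of that pipeline, not the authors' implementation. Everything in it is an assumption made for illustration: the residue groupings (`PROTEIN_GROUPS`, `RNA_GROUPS`) are hypothetical and are not the structure-derived classes from the paper, the five-residue unit is treated as a contiguous sequence window, the toy sequences and labels are invented, and the booster is a small pure-Python version with regression stumps on logistic loss rather than whatever library the authors used.

```python
import math
from collections import Counter

# HYPOTHETICAL groupings for illustration only; the paper derives its classes
# from structural data of RNA-protein complexes, which are not reproduced here.
PROTEIN_GROUPS = {"+": "KRH", "-": "DE", "p": "STNQCYW", "h": "AVLIMFGP"}
RNA_GROUPS = {"R": "AG", "Y": "CU"}

def window_counts(seq, groups, k=5):
    """Count k-residue windows after mapping each residue to its group label."""
    lookup = {res: label for label, members in groups.items() for res in members}
    reduced = "".join(lookup.get(res, "x") for res in seq.upper())
    return Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))

def pair_features(protein, rna):
    """Combine protein and RNA window counts into one prefixed feature map."""
    feats = Counter()
    for prefix, seq, groups in (("P:", protein, PROTEIN_GROUPS), ("R:", rna, RNA_GROUPS)):
        for window, n in window_counts(seq, groups).items():
            feats[prefix + window] += n
    return feats

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Least-squares regression stump: (feature, threshold, left, right)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:] if best else None

def stump_predict(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def fit_gbm(X, y, n_rounds=50, lr=0.5):
    """Gradient boosting on logistic loss: each stump fits the residuals y - p."""
    pos = sum(y) / len(y)
    base = math.log(pos / (1 - pos))  # needs both classes in y
    scores, stumps = [base] * len(y), []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        scores = [s + lr * stump_predict(stump, row) for s, row in zip(scores, X)]
    return base, lr, stumps

def predict_proba(model, row):
    base, lr, stumps = model
    return sigmoid(base + lr * sum(stump_predict(s, row) for s in stumps))

# Invented toy pairs (protein, RNA, label); not data from the paper.
TOY_PAIRS = [
    ("KKRKRKK", "GGAGGAG", 1),
    ("RKKRHKR", "AGGAGGA", 1),
    ("DEDEEDD", "CUCUUCU", 0),
    ("AVLDEED", "UUCUCCU", 0),
]
vocab = sorted({key for p, r, _ in TOY_PAIRS for key in pair_features(p, r)})

def to_vector(feats):
    return [feats.get(key, 0) for key in vocab]

X = [to_vector(pair_features(p, r)) for p, r, _ in TOY_PAIRS]
y = [label for _, _, label in TOY_PAIRS]
model = fit_gbm(X, y)
query = to_vector(pair_features("KRKKRKR", "GAGGAGA"))
print(round(predict_proba(model, query), 3))
```

Windows that never occurred in the toy training pairs are dropped by `to_vector`, so a real application would need a fixed vocabulary over all possible class windows and far more data than this sketch uses.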