
PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees

Protein-RNA interactions play essential roles in many aspects of biology. Quantifying the binding affinity of protein-RNA complexes helps in understanding protein-RNA recognition mechanisms and in identifying strong binding partners. Due to experimentally measured protein-RNA binding...


Bibliographic Details
Main Authors: Deng, Lei; Yang, Wenyi; Liu, Hui
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6688581/
https://www.ncbi.nlm.nih.gov/pubmed/31428122
http://dx.doi.org/10.3389/fgene.2019.00637
_version_ 1783442913565343744
author Deng, Lei
Yang, Wenyi
Liu, Hui
author_facet Deng, Lei
Yang, Wenyi
Liu, Hui
author_sort Deng, Lei
collection PubMed
description Protein-RNA interactions play essential roles in many aspects of biology. Quantifying the binding affinity of protein-RNA complexes helps in understanding protein-RNA recognition mechanisms and in identifying strong binding partners. Because experimentally measured protein-RNA binding affinity data remain limited, there is a pressing demand for accurate and reliable computational approaches. In this paper, we propose a computational approach, PredPRBA, which can effectively predict protein-RNA binding affinity using gradient boosted regression trees. We build a dataset of protein-RNA binding affinity that includes 103 protein-RNA complex structures manually collected from the related literature. Then, we generate 37 kinds of sequence and structural features and explore the relationship between the features and protein-RNA binding affinity. We find that the binding affinity mainly depends on the structure of RNA molecules. According to the type of RNA bound to the protein in each complex, we split the 103 protein-RNA complexes into six categories. For each category, we build a gradient boosted regression tree (GBRT) model based on the generated features. We perform a comprehensive evaluation of the proposed method on the binding affinity dataset using leave-one-out cross-validation. We show that PredPRBA achieves correlations ranging from 0.723 to 0.897 across the six categories, which is significantly better than other typical regression methods and the pioneering protein-RNA binding affinity predictor SPOT-Seq-RNA. In addition, a user-friendly web server has been developed to predict the binding affinity of protein-RNA complexes. The PredPRBA webserver is freely available at http://PredPRBA.denglab.org/.
format Online
Article
Text
id pubmed-6688581
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6688581 2019-08-19 PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees Deng, Lei Yang, Wenyi Liu, Hui Front Genet Genetics Frontiers Media S.A. 2019-08-02 /pmc/articles/PMC6688581/ /pubmed/31428122 http://dx.doi.org/10.3389/fgene.2019.00637 Text en Copyright © 2019 Deng, Yang and Liu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Deng, Lei
Yang, Wenyi
Liu, Hui
PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees
title PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6688581/
https://www.ncbi.nlm.nih.gov/pubmed/31428122
http://dx.doi.org/10.3389/fgene.2019.00637