PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees
Protein-RNA interactions play essential roles in many biological aspects. Quantifying the binding affinity of protein-RNA complexes is helpful to the understanding of protein-RNA recognition mechanisms and identification of strong binding partners. Due to experimentally measured protein-RNA binding...
Main authors: | Deng, Lei; Yang, Wenyi; Liu, Hui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6688581/ https://www.ncbi.nlm.nih.gov/pubmed/31428122 http://dx.doi.org/10.3389/fgene.2019.00637 |
_version_ | 1783442913565343744 |
---|---|
author | Deng, Lei Yang, Wenyi Liu, Hui |
author_facet | Deng, Lei Yang, Wenyi Liu, Hui |
author_sort | Deng, Lei |
collection | PubMed |
description | Protein-RNA interactions play essential roles in many aspects of biology. Quantifying the binding affinity of protein-RNA complexes helps in understanding protein-RNA recognition mechanisms and in identifying strong binding partners. Because experimentally measured protein-RNA binding affinity data are still limited, there is a pressing demand for accurate and reliable computational approaches. In this paper, we propose a computational approach, PredPRBA, which predicts protein-RNA binding affinity using gradient boosted regression trees. We build a dataset of protein-RNA binding affinity that includes 103 protein-RNA complex structures manually collected from the related literature. Then, we generate 37 kinds of sequence and structural features and explore the relationship between the features and protein-RNA binding affinity. We find that the binding affinity mainly depends on the structure of RNA molecules. According to the type of RNA bound to the protein in each complex, we split the 103 protein-RNA complexes into six categories. For each category, we build a gradient boosted regression tree (GBRT) model based on the generated features. We evaluate the proposed method on the binding affinity dataset using leave-one-out cross-validation. We show that PredPRBA achieves correlations ranging from 0.723 to 0.897 across the six categories, which is significantly better than other typical regression methods and the pioneering protein-RNA binding affinity predictor SPOT-Seq-RNA. In addition, a user-friendly web server has been developed to predict the binding affinity of protein-RNA complexes. The PredPRBA webserver is freely available at http://PredPRBA.denglab.org/. |
format | Online Article Text |
id | pubmed-6688581 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6688581 2019-08-19 PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees Deng, Lei Yang, Wenyi Liu, Hui Front Genet Genetics Frontiers Media S.A. 2019-08-02 /pmc/articles/PMC6688581/ /pubmed/31428122 http://dx.doi.org/10.3389/fgene.2019.00637 Text en Copyright © 2019 Deng, Yang and Liu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Deng, Lei Yang, Wenyi Liu, Hui PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees |
title | PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees |
title_full | PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees |
title_fullStr | PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees |
title_full_unstemmed | PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees |
title_short | PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees |
title_sort | predprba: prediction of protein-rna binding affinity using gradient boosted regression trees |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6688581/ https://www.ncbi.nlm.nih.gov/pubmed/31428122 http://dx.doi.org/10.3389/fgene.2019.00637 |
work_keys_str_mv | AT denglei predprbapredictionofproteinrnabindingaffinityusinggradientboostedregressiontrees AT yangwenyi predprbapredictionofproteinrnabindingaffinityusinggradientboostedregressiontrees AT liuhui predprbapredictionofproteinrnabindingaffinityusinggradientboostedregressiontrees |
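
The evaluation protocol summarized in the description (one gradient boosted regression tree model per RNA category, scored by leave-one-out cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes depth-1 trees (stumps), squared-error loss, Pearson correlation as the reported correlation, hypothetical hyperparameters (`n_trees`, `learning_rate`), and a synthetic three-feature dataset in place of the 37 real features. All function names are invented for this example.

```python
import math
import random


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree to the residuals by least squares."""
    best = None  # (sse, feature index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no valid split: fall back to the mean residual
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm


def fit_gbrt(X, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Residuals are the negative gradient of 0.5 * (y - f)^2 with respect to f.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row) for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_gbrt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def loocv_correlation(X, y, n_trees=50, learning_rate=0.1):
    """Hold out each sample once, train on the rest, correlate predictions with y."""
    preds = []
    for i in range(len(y)):
        model = fit_gbrt(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_trees, learning_rate)
        preds.append(predict_gbrt(model, X[i]))
    return pearson(y, preds), preds


# Synthetic stand-in for one RNA category (assumption: 12 complexes, 3 features).
random.seed(0)
X_demo = [[random.random() for _ in range(3)] for _ in range(12)]
y_demo = [2.0 * row[0] - row[1] + random.gauss(0.0, 0.05) for row in X_demo]
r_demo, preds_demo = loocv_correlation(X_demo, y_demo, n_trees=30)
print(f"LOOCV Pearson r on synthetic data: {r_demo:.3f}")
```

In a per-category setup like the one the description outlines, `loocv_correlation` would be called once for each of the six categories on that category's own feature matrix. In practice a library implementation such as scikit-learn's `GradientBoostingRegressor` combined with `LeaveOneOut` would likely replace the hand-written stumps.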