Cargando…
XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
Hot spot residues at protein–RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein–RNA binding hot spots is critical for drug designing and protein engineering. Although some progress has been made by utilizing various av...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2019
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6471955/ https://www.ncbi.nlm.nih.gov/pubmed/30901953 http://dx.doi.org/10.3390/genes10030242 |
_version_ | 1783412144044244992 |
---|---|
author | Deng, Lei Sui, Yuanchao Zhang, Jingpu |
author_facet | Deng, Lei Sui, Yuanchao Zhang, Jingpu |
author_sort | Deng, Lei |
collection | PubMed |
description | Hot spot residues at protein–RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein–RNA binding hot spots is critical for drug designing and protein engineering. Although some progress has been made by utilizing various available features and a series of machine learning approaches, these methods are still in the infant stage. In this paper, we present a new computational method named XGBPRH, which is based on an eXtreme Gradient Boosting (XGBoost) algorithm and can effectively predict hot spot residues in protein–RNA interfaces utilizing an optimal set of properties. Firstly, we download 47 protein–RNA complexes and calculate a total of 156 sequence, structure, exposure, and network features. Next, we adopt a two-step feature selection algorithm to extract a combination of 6 optimal features from the combination of these 156 features. Compared with the state-of-the-art approaches, XGBPRH achieves better performances with an area under the ROC curve (AUC) score of 0.817 and an F1-score of 0.802 on the independent test set. Meanwhile, we also apply XGBPRH to two case studies. The results demonstrate that the method can effectively identify novel energy hotspots. |
format | Online Article Text |
id | pubmed-6471955 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-64719552019-04-27 XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting Deng, Lei Sui, Yuanchao Zhang, Jingpu Genes (Basel) Article Hot spot residues at protein–RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein–RNA binding hot spots is critical for drug designing and protein engineering. Although some progress has been made by utilizing various available features and a series of machine learning approaches, these methods are still in the infant stage. In this paper, we present a new computational method named XGBPRH, which is based on an eXtreme Gradient Boosting (XGBoost) algorithm and can effectively predict hot spot residues in protein–RNA interfaces utilizing an optimal set of properties. Firstly, we download 47 protein–RNA complexes and calculate a total of 156 sequence, structure, exposure, and network features. Next, we adopt a two-step feature selection algorithm to extract a combination of 6 optimal features from the combination of these 156 features. Compared with the state-of-the-art approaches, XGBPRH achieves better performances with an area under the ROC curve (AUC) score of 0.817 and an F1-score of 0.802 on the independent test set. Meanwhile, we also apply XGBPRH to two case studies. The results demonstrate that the method can effectively identify novel energy hotspots. MDPI 2019-03-21 /pmc/articles/PMC6471955/ /pubmed/30901953 http://dx.doi.org/10.3390/genes10030242 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Deng, Lei Sui, Yuanchao Zhang, Jingpu XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting |
title | XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting |
title_full | XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting |
title_fullStr | XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting |
title_full_unstemmed | XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting |
title_short | XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting |
title_sort | xgbprh: prediction of binding hot spots at protein–rna interfaces utilizing extreme gradient boosting |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6471955/ https://www.ncbi.nlm.nih.gov/pubmed/30901953 http://dx.doi.org/10.3390/genes10030242 |
work_keys_str_mv | AT denglei xgbprhpredictionofbindinghotspotsatproteinrnainterfacesutilizingextremegradientboosting AT suiyuanchao xgbprhpredictionofbindinghotspotsatproteinrnainterfacesutilizingextremegradientboosting AT zhangjingpu xgbprhpredictionofbindinghotspotsatproteinrnainterfacesutilizingextremegradientboosting |