
XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting

Hot spot residues in protein–RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein–RNA binding hot spots is critical for drug design and protein engineering. Although some progress has been made by utilizing various av...

Bibliographic Details
Main authors: Deng, Lei, Sui, Yuanchao, Zhang, Jingpu
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6471955/
https://www.ncbi.nlm.nih.gov/pubmed/30901953
http://dx.doi.org/10.3390/genes10030242
_version_ 1783412144044244992
author Deng, Lei
Sui, Yuanchao
Zhang, Jingpu
author_facet Deng, Lei
Sui, Yuanchao
Zhang, Jingpu
author_sort Deng, Lei
collection PubMed
description Hot spot residues in protein–RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein–RNA binding hot spots is critical for drug design and protein engineering. Although some progress has been made using various available features and a range of machine learning approaches, these methods are still in their infancy. In this paper, we present a new computational method named XGBPRH, which is based on the eXtreme Gradient Boosting (XGBoost) algorithm and can effectively predict hot spot residues in protein–RNA interfaces using an optimal set of properties. First, we download 47 protein–RNA complexes and calculate a total of 156 sequence, structure, exposure, and network features. Next, we adopt a two-step feature selection algorithm to extract a combination of 6 optimal features from these 156 features. Compared with state-of-the-art approaches, XGBPRH achieves better performance, with an area under the ROC curve (AUC) score of 0.817 and an F1-score of 0.802 on the independent test set. We also apply XGBPRH to two case studies. The results demonstrate that the method can effectively identify novel energy hot spots.
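The two evaluation metrics named in the description, AUC and F1-score, can be illustrated with a minimal sketch. The function names `roc_auc` and `f1_score`, the 0.5 decision threshold, and the toy labels and scores below are assumptions made for illustration; they are not taken from the article and do not reproduce its data or results.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """F1 as the harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = hot spot residue, 0 = non-hot spot residue.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

# Assumed threshold of 0.5 to turn classifier scores into class predictions.
predictions = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(labels, scores)
f1 = f1_score(labels, predictions)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```

AUC is computed from the raw scores and does not depend on a threshold, whereas F1 depends on the chosen cutoff, which is why the two numbers are usually reported together.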
format Online
Article
Text
id pubmed-6471955
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6471955 2019-04-27 XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting Deng, Lei Sui, Yuanchao Zhang, Jingpu Genes (Basel) Article Hot spot residues in protein–RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein–RNA binding hot spots is critical for drug design and protein engineering. Although some progress has been made using various available features and a range of machine learning approaches, these methods are still in their infancy. In this paper, we present a new computational method named XGBPRH, which is based on the eXtreme Gradient Boosting (XGBoost) algorithm and can effectively predict hot spot residues in protein–RNA interfaces using an optimal set of properties. First, we download 47 protein–RNA complexes and calculate a total of 156 sequence, structure, exposure, and network features. Next, we adopt a two-step feature selection algorithm to extract a combination of 6 optimal features from these 156 features. Compared with state-of-the-art approaches, XGBPRH achieves better performance, with an area under the ROC curve (AUC) score of 0.817 and an F1-score of 0.802 on the independent test set. We also apply XGBPRH to two case studies. The results demonstrate that the method can effectively identify novel energy hot spots. MDPI 2019-03-21 /pmc/articles/PMC6471955/ /pubmed/30901953 http://dx.doi.org/10.3390/genes10030242 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Deng, Lei
Sui, Yuanchao
Zhang, Jingpu
XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
title XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
title_full XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
title_fullStr XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
title_full_unstemmed XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
title_short XGBPRH: Prediction of Binding Hot Spots at Protein–RNA Interfaces Utilizing Extreme Gradient Boosting
title_sort xgbprh: prediction of binding hot spots at protein–rna interfaces utilizing extreme gradient boosting
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6471955/
https://www.ncbi.nlm.nih.gov/pubmed/30901953
http://dx.doi.org/10.3390/genes10030242
work_keys_str_mv AT denglei xgbprhpredictionofbindinghotspotsatproteinrnainterfacesutilizingextremegradientboosting
AT suiyuanchao xgbprhpredictionofbindinghotspotsatproteinrnainterfacesutilizingextremegradientboosting
AT zhangjingpu xgbprhpredictionofbindinghotspotsatproteinrnainterfacesutilizingextremegradientboosting