MetaScore: A Novel Machine-Learning-Based Approach to Improve Traditional Scoring Functions for Scoring Protein–Protein Docking Conformations

Bibliographic Details
Main authors: Jung, Yong; Geng, Cunliang; Bonvin, Alexandre M. J. J.; Xue, Li C.; Honavar, Vasant G.
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9855734/
https://www.ncbi.nlm.nih.gov/pubmed/36671507
http://dx.doi.org/10.3390/biom13010121
author Jung, Yong
Geng, Cunliang
Bonvin, Alexandre M. J. J.
Xue, Li C.
Honavar, Vasant G.
collection PubMed
description Protein–protein interactions play a ubiquitous role in biological function. Knowledge of the three-dimensional (3D) structures of the complexes they form is essential for understanding the structural basis of those interactions and how they orchestrate key cellular processes. Computational docking has become an indispensable alternative to the expensive and time-consuming experimental approaches for determining the 3D structures of protein complexes. Despite recent progress, identifying near-native models from a large set of conformations sampled by docking—the so-called scoring problem—still has considerable room for improvement. We present MetaScore, a new machine-learning-based approach to improve the scoring of docked conformations. MetaScore utilizes a random forest (RF) classifier trained to distinguish near-native from non-native conformations using their protein–protein interfacial features. The features include physicochemical properties, energy terms, interaction-propensity-based features, geometric properties, interface topology features, evolutionary conservation, and also scores produced by traditional scoring functions (SFs). MetaScore scores docked conformations by simply averaging the score produced by the RF classifier with that produced by any traditional SF. We demonstrate that (i) MetaScore consistently outperforms each of the nine traditional SFs included in this work in terms of success rate and hit rate evaluated over conformations ranked among the top 10; (ii) an ensemble method, MetaScore-Ensemble, that combines 10 variants of MetaScore obtained by combining the RF score with each of the traditional SFs outperforms each of the MetaScore variants. We conclude that the performance of traditional SFs can be improved upon by using machine learning to judiciously leverage protein–protein interfacial features and by using ensemble methods to combine multiple scoring functions.
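The averaging step described in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the RF score is averaged with a traditional SF score; everything else here is an assumption made for illustration: the function names (`min_max_normalize`, `metascore`, `hits_in_top_n`), the min-max rescaling of SF scores to [0, 1], the convention that a lower SF score (an energy) is better, and the toy numbers. The published method may normalize and rank differently.

```python
from typing import Dict, Set


def min_max_normalize(scores: Dict[str, float],
                      lower_is_better: bool = False) -> Dict[str, float]:
    """Rescale scores to [0, 1] so that 1 always marks the best conformation.

    Assumption: min-max scaling is used only to make the SF score
    comparable to a classifier probability; the source does not specify it.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All conformations tie; give each a neutral value.
        return {name: 0.5 for name in scores}
    scaled = {name: (s - lo) / (hi - lo) for name, s in scores.items()}
    if lower_is_better:
        scaled = {name: 1.0 - s for name, s in scaled.items()}
    return scaled


def metascore(rf_prob: Dict[str, float],
              sf_scores: Dict[str, float],
              sf_lower_is_better: bool = True) -> Dict[str, float]:
    """Average the RF near-native probability with a normalized SF score."""
    sf_norm = min_max_normalize(sf_scores, lower_is_better=sf_lower_is_better)
    return {name: 0.5 * (rf_prob[name] + sf_norm[name]) for name in rf_prob}


def hits_in_top_n(combined: Dict[str, float],
                  near_native: Set[str],
                  n: int = 10) -> int:
    """Count near-native conformations among the n best-ranked ones."""
    ranked = sorted(combined, key=combined.get, reverse=True)
    return sum(1 for name in ranked[:n] if name in near_native)


# Toy data (hypothetical): three docked models of one complex.
rf_prob = {"m1": 0.9, "m2": 0.2, "m3": 0.6}            # RF probabilities
sf_scores = {"m1": -120.0, "m2": -80.0, "m3": -100.0}  # energy-like SF scores

combined = metascore(rf_prob, sf_scores)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
print(hits_in_top_n(combined, near_native={"m1"}, n=2))
```

A case counts as a success in the top 10 when `hits_in_top_n(..., n=10)` is at least 1; averaging that indicator over many complexes gives a success rate of the kind the abstract reports.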
format Online
Article
Text
id pubmed-9855734
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9855734 2023-01-21 Biomolecules Article MDPI 2023-01-06 Text en
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title MetaScore: A Novel Machine-Learning-Based Approach to Improve Traditional Scoring Functions for Scoring Protein–Protein Docking Conformations
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9855734/
https://www.ncbi.nlm.nih.gov/pubmed/36671507
http://dx.doi.org/10.3390/biom13010121