An information gain-based approach for evaluating protein structure models

For three decades now, knowledge-based scoring functions that operate through the “potential of mean force” (PMF) approach have continuously proven useful for studying protein structures. Although these statistical potentials are not to be confused with their physics-based counterparts of the same n...

Bibliographic Details
Main authors: Postic, Guillaume, Janel, Nathalie, Tufféry, Pierre, Moroy, Gautier
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7431362/
https://www.ncbi.nlm.nih.gov/pubmed/32837711
http://dx.doi.org/10.1016/j.csbj.2020.08.013
_version_ 1783571567142240256
author Postic, Guillaume
Janel, Nathalie
Tufféry, Pierre
Moroy, Gautier
author_sort Postic, Guillaume
collection PubMed
description For three decades now, knowledge-based scoring functions that operate through the “potential of mean force” (PMF) approach have continuously proven useful for studying protein structures. Although these statistical potentials are not to be confused with their physics-based counterparts of the same name (i.e. PMFs obtained by molecular dynamics simulations), their particular success in assessing the native-like character of protein structure predictions has led authors to consider the computed scores as approximations of the free energy. However, this physical justification has been a matter of controversy since the beginning. Alternative interpretations based on Bayes’ theorem have been proposed, but the misleading formalism that invokes the inverse Boltzmann law remains recurrent in the literature. In this article, we present a conceptually new method for ranking protein structure models by quality, which is (i) independent of any physics-based explanation and (ii) relevant to statistics and to a general definition of information gain. The theoretical development described in this study provides new insights into how statistical PMFs work, in comparison with our approach. To prove the concept, we have built interatomic distance-dependent scoring functions, based on the former and new equations, and compared their performance on an independent benchmark of 60,000 protein structures. The results demonstrate that our new formalism outperforms statistical PMFs in evaluating the quality of protein structural decoys. Therefore, this original type of score offers a possibility to improve the success of statistical PMFs in the various fields of structural biology where they are applied. The open-source code is available for download at https://gitlab.rpbs.univ-paris-diderot.fr/src/ig-score.
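For readers unfamiliar with the conventional formalism that the abstract contrasts against, the following is a minimal sketch of a distance-dependent statistical PMF in its generic textbook form: a per-bin score of -ln(f_obs / f_ref), with the k_B·T factor set to 1. It is not the authors' information gain score, whose equations are not given in this record, and it is not taken from the ig-score repository. The function names, bin width, distance cutoff, and pseudo-count are all illustrative assumptions.

```python
import math


def distance_histogram(distances, bin_width=1.0, max_dist=15.0):
    """Relative frequency of distances per bin; values >= max_dist are ignored.

    bin_width and max_dist are hypothetical defaults chosen for illustration.
    """
    n_bins = int(max_dist / bin_width)
    counts = [0] * n_bins
    for d in distances:
        idx = int(d / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def statistical_pmf(obs_freq, ref_freq, pseudo=1e-6):
    """Generic inverse-Boltzmann-style score per bin: -ln(f_obs / f_ref).

    The pseudo-count (an assumption) only guards against log(0) and
    division by zero in empty bins.
    """
    return [-math.log((o + pseudo) / (r + pseudo))
            for o, r in zip(obs_freq, ref_freq)]


def score_model(distances, pmf, bin_width=1.0):
    """Sum the per-bin scores over all distances measured in a model."""
    total = 0.0
    for d in distances:
        idx = int(d / bin_width)
        if 0 <= idx < len(pmf):
            total += pmf[idx]
    return total


# Toy frequencies for one atom-pair type (made-up numbers, not real data).
observed = [0.5, 0.5]     # frequencies in native structures
reference = [0.25, 0.75]  # frequencies in the reference state
pmf = statistical_pmf(observed, reference)
model_score = score_model([0.2, 1.1], pmf, bin_width=1.0)
```

Under this convention, a bin that is more populated in native structures than in the reference state receives a negative (favourable) score, so lower totals indicate more native-like models.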
format Online
Article
Text
id pubmed-7431362
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-7431362 2020-08-18 An information gain-based approach for evaluating protein structure models. Postic, Guillaume; Janel, Nathalie; Tufféry, Pierre; Moroy, Gautier. Comput Struct Biotechnol J. Research Article. Research Network of Computational and Structural Biotechnology 2020-08-18 /pmc/articles/PMC7431362/ /pubmed/32837711 http://dx.doi.org/10.1016/j.csbj.2020.08.013 Text en © 2020 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title An information gain-based approach for evaluating protein structure models
topic Research Article