On the Importance of the Distance Measures Used to Train and Test Knowledge-Based Potentials for Proteins
Knowledge-based potentials are energy functions derived from the analysis of databases of protein structures and sequences. They can be divided into two classes. Potentials from the first class are based on a direct conversion of the distributions of some geometric properties observed in native prot...
Main authors: | Carlsen, Martin; Koehl, Patrice; Røgen, Peter |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4239004/ https://www.ncbi.nlm.nih.gov/pubmed/25411785 http://dx.doi.org/10.1371/journal.pone.0109335 |
_version_ | 1782345542760660992 |
---|---|
author | Carlsen, Martin; Koehl, Patrice; Røgen, Peter |
collection | PubMed |
description | Knowledge-based potentials are energy functions derived from the analysis of databases of protein structures and sequences. They can be divided into two classes. Potentials from the first class are based on a direct conversion of the distributions of some geometric properties observed in native protein structures into energy values, while potentials from the second class are trained to mimic quantitatively the geometric differences between incorrectly folded models and native structures. In this paper, we focus on the relationship between energy and geometry when training the second class of knowledge-based potentials. We assume that the difference in energy between a decoy structure and the corresponding native structure is linearly related to the distance between the two structures. We trained two distance-based knowledge-based potentials accordingly, one based on all inter-residue distances (PPD), while the other had the set of all distances filtered to reflect consistency in an ensemble of decoys (PPE). We tested four types of metric to characterize the distance between the decoy and the native structure, two based on extrinsic geometry (RMSD and GDT-TS*), and two based on intrinsic geometry (Q* and MT). The corresponding eight potentials were tested on a large collection of decoy sets. We found that it is usually better to train a potential using an intrinsic distance measure. We also found that PPE outperforms PPD, emphasizing the benefits of capturing consistent information in an ensemble. The relevance of these results for the design of knowledge-based potentials is discussed. |
format | Online Article Text |
id | pubmed-4239004 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4239004 2014-11-26 On the Importance of the Distance Measures Used to Train and Test Knowledge-Based Potentials for Proteins. Carlsen, Martin; Koehl, Patrice; Røgen, Peter. PLoS One. Research Article. Public Library of Science 2014-11-20 /pmc/articles/PMC4239004/ /pubmed/25411785 http://dx.doi.org/10.1371/journal.pone.0109335 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. |
title | On the Importance of the Distance Measures Used to Train and Test Knowledge-Based Potentials for Proteins |
topic | Research Article |
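The abstract's central assumption, that the energy gap between a decoy and its native structure is linearly related to the distance between them, can be illustrated with a small NumPy sketch. Everything below is an assumption made for illustration and is not the authors' PPD or PPE method: the histogram features, the bin edges, and the function names are hypothetical, the extrinsic measure is plain RMSD after Kabsch superposition, and the intrinsic measure is a distance-matrix RMSD used as a stand-in for Q* or MT.

```python
import numpy as np


def pairwise_distances(coords):
    """Return all inter-residue distances (upper triangle) of an (n, 3) array."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(coords), k=1)]


def rmsd_extrinsic(a, b):
    """Extrinsic measure: RMSD after optimal rigid superposition (Kabsch)."""
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    u, s, vt = np.linalg.svd(a0.T @ b0)
    # Flip the smallest singular value if the best fit would be a reflection.
    s[-1] *= np.sign(np.linalg.det(u @ vt))
    msd = ((a0 ** 2).sum() + (b0 ** 2).sum() - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))


def drmsd_intrinsic(a, b):
    """Intrinsic measure (stand-in): RMS difference of internal distances."""
    delta = pairwise_distances(a) - pairwise_distances(b)
    return float(np.sqrt((delta ** 2).mean()))


def features(coords, edges):
    """Hypothetical features: counts of inter-residue distances per bin."""
    counts, _ = np.histogram(pairwise_distances(coords), bins=edges)
    return counts.astype(float)


def fit_linear_potential(native, decoys, distance_fn, edges):
    """Least-squares weights w so that E(decoy) - E(native) ~ distance.

    The energy is modeled as E(x) = w . features(x); this is only a sketch
    of the linear energy-distance assumption, not a published potential.
    """
    f_native = features(native, edges)
    x = np.array([features(d, edges) - f_native for d in decoys])
    y = np.array([distance_fn(d, native) for d in decoys])
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w
```

Passing `rmsd_extrinsic` or `drmsd_intrinsic` as `distance_fn` trains the same toy model against an extrinsic or an intrinsic target, which mirrors, in a very reduced form, the comparison described in the abstract.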