
Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off

The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literat...


Bibliographic Details
Main authors: Postic, Guillaume, Janel, Nathalie, Moroy, Gautier
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120936/
https://www.ncbi.nlm.nih.gov/pubmed/34025948
http://dx.doi.org/10.1016/j.csbj.2021.04.049
_version_ 1783692214841376768
author Postic, Guillaume
Janel, Nathalie
Moroy, Gautier
author_facet Postic, Guillaume
Janel, Nathalie
Moroy, Gautier
author_sort Postic, Guillaume
collection PubMed
description The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literature, and is mostly motivated by empirical and practical reasons, such as the computational cost of assessing the numerous folds of the protein conformational space. Here, we present a comprehensive study, carried out on a large and balanced benchmark of predicted protein structures, to see how different types of structural representations rank in either accuracy or calculation speed, and which ones offer the best compromise between these two criteria. We tested ten representations, including low-resolution, high-resolution, and coarse-grained approaches. We also investigated the generalization of the findings to formalisms other than the widely used “potential of mean force” (PMF) method. Thus, we observed that representing protein structures by their β carbons, combined or not with Cα, provides the best speed–accuracy trade-off when using a “total information gain” scoring function. For statistical PMFs, using MARTINI backbone and side-chain beads is the best option. Finally, we also demonstrated the necessity of training the reference state on all atom types, and of including the Cα atoms of glycine residues, in a Cβ-based representation.
format Online
Article
Text
id pubmed-8120936
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-81209362021-05-21 Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off Postic, Guillaume Janel, Nathalie Moroy, Gautier Comput Struct Biotechnol J Research Article The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literature, and is mostly motivated by empirical and practical reasons, such as the computational cost of assessing the numerous folds of the protein conformational space. Here, we present a comprehensive study, carried out on a large and balanced benchmark of predicted protein structures, to see how different types of structural representations rank in either accuracy or calculation speed, and which ones offer the best compromise between these two criteria. We tested ten representations, including low-resolution, high-resolution, and coarse-grained approaches. We also investigated the generalization of the findings to formalisms other than the widely used “potential of mean force” (PMF) method. Thus, we observed that representing protein structures by their β carbons, combined or not with Cα, provides the best speed–accuracy trade-off when using a “total information gain” scoring function. For statistical PMFs, using MARTINI backbone and side-chain beads is the best option. Finally, we also demonstrated the necessity of training the reference state on all atom types, and of including the Cα atoms of glycine residues, in a Cβ-based representation. Research Network of Computational and Structural Biotechnology 2021-04-28 /pmc/articles/PMC8120936/ /pubmed/34025948 http://dx.doi.org/10.1016/j.csbj.2021.04.049 Text en © 2021 The Authors. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Research Article
Postic, Guillaume
Janel, Nathalie
Moroy, Gautier
Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
title Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
title_full Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
title_fullStr Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
title_full_unstemmed Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
title_short Representations of protein structure for exploring the conformational space: A speed–accuracy trade-off
title_sort representations of protein structure for exploring the conformational space: a speed–accuracy trade-off
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120936/
https://www.ncbi.nlm.nih.gov/pubmed/34025948
http://dx.doi.org/10.1016/j.csbj.2021.04.049
work_keys_str_mv AT posticguillaume representationsofproteinstructureforexploringtheconformationalspaceaspeedaccuracytradeoff
AT janelnathalie representationsofproteinstructureforexploringtheconformationalspaceaspeedaccuracytradeoff
AT moroygautier representationsofproteinstructureforexploringtheconformationalspaceaspeedaccuracytradeoff