
Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network

Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Ros...


Bibliographic Details
Main Authors: Nazet, Julian; Lang, Elmar; Merkl, Rainer
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8389498/
https://www.ncbi.nlm.nih.gov/pubmed/34437621
http://dx.doi.org/10.1371/journal.pone.0256691
_version_ 1783742872511578112
author Nazet, Julian
Lang, Elmar
Merkl, Rainer
collection PubMed
description Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences that introduce the desired properties. Challenging design problems require a protein to be represented by a structural ensemble. Thus, Rosetta multi-state design (MSD) protocols have been developed in which each state represents one protein conformation. The computational demands of MSD protocols are high, because for each candidate sequence a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. As neural networks (NNs) have proved well suited to learning such solution spaces, we integrated one into the Rosetta:MSF framework, replacing the genetic algorithm used so far, with the aim of reducing computational costs. Like its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores is used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked by the scores predicted by the NN. Costly 3D models are computed only for a small fraction of the best-scoring sequences; these and the corresponding 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores quite well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten times faster than the genetic algorithm and finds sequences with comparable scores.
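The iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and uses no Rosetta API. Everything in it is an assumption made for illustration: `expensive_score` is a toy stand-in for a 3D-model-based Rosetta score, `AdditiveSurrogate` is a simple per-position additive model swapped in for the design-specific neural network, and the choice to replace the worst-scoring half of the pool is one possible reading of "replace half of the candidate sequences".

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def expensive_score(seq, target):
    """Toy stand-in for a costly 3D-based score (lower is better)."""
    return sum(a != b for a, b in zip(seq, target))


class AdditiveSurrogate:
    """Hypothetical stand-in for the NN: one weight per (position, residue)."""

    def __init__(self, length):
        self.w = [dict.fromkeys(ALPHABET, 0.0) for _ in range(length)]
        self.bias = 0.0

    def predict(self, seq):
        return self.bias + sum(self.w[i][a] for i, a in enumerate(seq))

    def fit(self, seqs, scores, epochs=50, lr=0.05):
        # Plain stochastic gradient descent on the squared error.
        for _ in range(epochs):
            for seq, y in zip(seqs, scores):
                err = self.predict(seq) - y
                self.bias -= lr * err
                for i, a in enumerate(seq):
                    self.w[i][a] -= lr * err


def mutate(seq, rng, n_mut=2):
    """Return a copy of seq with n_mut randomly chosen positions redrawn."""
    s = list(seq)
    for i in rng.sample(range(len(s)), n_mut):
        s[i] = rng.choice(ALPHABET)
    return "".join(s)


def design(target, pool_size=20, n_iter=15, n_virtual=500, seed=0):
    rng = random.Random(seed)
    length = len(target)
    # Initial candidate pool, scored with the expensive function.
    pool = {}
    while len(pool) < pool_size:
        s = "".join(rng.choice(ALPHABET) for _ in range(length))
        pool[s] = expensive_score(s, target)
    evaluated = dict(pool)  # union of all sequences scored so far
    n_expensive = len(evaluated)
    for _ in range(n_iter):
        # Re-train the surrogate on every sequence scored so far.
        model = AdditiveSurrogate(length)
        model.fit(list(evaluated), list(evaluated.values()))
        # Cheaply assess many alternative sequences with the surrogate.
        parents = list(pool)
        cands = {mutate(rng.choice(parents), rng) for _ in range(n_virtual)}
        cands.difference_update(evaluated)
        ranked = sorted(sorted(cands), key=model.predict)
        # Expensive scoring only for the best-ranked fraction.
        top = ranked[: pool_size // 2]
        for s in top:
            evaluated[s] = expensive_score(s, target)
        n_expensive += len(top)
        # Assumption: the newly scored sequences replace the worst half.
        survivors = sorted(pool, key=pool.get)[: pool_size - len(top)]
        pool = {s: pool[s] for s in survivors}
        pool.update({s: evaluated[s] for s in top})
    best = min(pool, key=pool.get)
    return best, pool[best], n_expensive


if __name__ == "__main__":
    print(design("ACDEFGHIKL"))
```

With the default arguments, at most 20 + 15 × 10 = 170 expensive evaluations are spent, while the surrogate ranks up to 500 alternative sequences per iteration. That ratio is the source of the savings the abstract describes.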
format Online
Article
Text
id pubmed-8389498
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8389498 2021-08-27 Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network Nazet, Julian Lang, Elmar Merkl, Rainer PLoS One Research Article Public Library of Science 2021-08-26 /pmc/articles/PMC8389498/ /pubmed/34437621 http://dx.doi.org/10.1371/journal.pone.0256691 Text en © 2021 Nazet et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8389498/
https://www.ncbi.nlm.nih.gov/pubmed/34437621
http://dx.doi.org/10.1371/journal.pone.0256691