Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network
Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Ros...
Main authors: | Nazet, Julian; Lang, Elmar; Merkl, Rainer |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8389498/ https://www.ncbi.nlm.nih.gov/pubmed/34437621 http://dx.doi.org/10.1371/journal.pone.0256691 |
_version_ | 1783742872511578112 |
---|---|
author | Nazet, Julian Lang, Elmar Merkl, Rainer |
author_facet | Nazet, Julian Lang, Elmar Merkl, Rainer |
author_sort | Nazet, Julian |
collection | PubMed |
description | Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Rosetta multi-state design (MSD) protocols have been developed wherein each state represents one protein conformation. Computational demands of MSD protocols are high, because for each of the candidate sequences a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. As neural networks (NN) proved well-suited to learn such solution spaces, we integrated one into the framework Rosetta:MSF in place of the genetic algorithm used so far, with the aim of reducing computational costs. Like its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores is used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked by the scores predicted by the NN. Costly 3D models are computed only for a small fraction of best-scoring sequences; these and the corresponding 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores quite well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten times faster than the genetic algorithm and finds sequences with comparable scores. |
format | Online Article Text |
id | pubmed-8389498 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8389498 2021-08-27 Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network Nazet, Julian Lang, Elmar Merkl, Rainer PLoS One Research Article Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Rosetta multi-state design (MSD) protocols have been developed wherein each state represents one protein conformation. Computational demands of MSD protocols are high, because for each of the candidate sequences a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. As neural networks (NN) proved well-suited to learn such solution spaces, we integrated one into the framework Rosetta:MSF in place of the genetic algorithm used so far, with the aim of reducing computational costs. Like its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores is used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked by the scores predicted by the NN. Costly 3D models are computed only for a small fraction of best-scoring sequences; these and the corresponding 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores quite well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten times faster than the genetic algorithm and finds sequences with comparable scores. Public Library of Science 2021-08-26 /pmc/articles/PMC8389498/ /pubmed/34437621 http://dx.doi.org/10.1371/journal.pone.0256691 Text en © 2021 Nazet et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Nazet, Julian Lang, Elmar Merkl, Rainer Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network |
title | Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network |
title_full | Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network |
title_fullStr | Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network |
title_full_unstemmed | Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network |
title_short | Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network |
title_sort | rosetta:msf:nn: boosting performance of multi-state computational protein design with a neural network |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8389498/ https://www.ncbi.nlm.nih.gov/pubmed/34437621 http://dx.doi.org/10.1371/journal.pone.0256691 |
work_keys_str_mv | AT nazetjulian rosettamsfnnboostingperformanceofmultistatecomputationalproteindesignwithaneuralnetwork AT langelmar rosettamsfnnboostingperformanceofmultistatecomputationalproteindesignwithaneuralnetwork AT merklrainer rosettamsfnnboostingperformanceofmultistatecomputationalproteindesignwithaneuralnetwork |
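
The iterative scheme summarised in the description field (retrain a fast predictor on all scored sequences, rank many alternatives by predicted score, compute the costly score only for the best few, and let these replace half of the candidate set) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from Rosetta:MSF:NN: the four-letter alphabet, the toy `expensive_score` function standing in for a 3D-model-based Rosetta score, the additive per-position model standing in for the neural network, the convention that lower scores are better, and the choice to replace the worst-scoring half of the pool.

```python
import math
import random

ALPHABET = "ACDE"          # toy alphabet (assumption); real designs use 20 amino acids
LENGTH = 6                 # number of designable positions (assumption)
HIDDEN_OPTIMUM = "ACDEAC"  # hypothetical optimum that defines the toy landscape


def expensive_score(seq):
    """Stand-in for the costly 3D-based score; lower is better (assumption)."""
    return float(sum(a != b for a, b in zip(seq, HIDDEN_OPTIMUM)))


def train_surrogate(scored, epochs=200, lr=0.05):
    """Fit additive per-position weights by SGD; a simple stand-in for the NN."""
    weights = {(i, a): 0.0 for i in range(LENGTH) for a in ALPHABET}
    bias = 0.0
    for _ in range(epochs):
        for seq, target in scored.items():
            pred = bias + sum(weights[(i, a)] for i, a in enumerate(seq))
            err = pred - target
            bias -= lr * err
            for i, a in enumerate(seq):
                weights[(i, a)] -= lr * err

    def predict(seq):
        return bias + sum(weights[(i, a)] for i, a in enumerate(seq))

    return predict


def mutate(seq, rng):
    """Replace one randomly chosen position with a random letter."""
    i = rng.randrange(LENGTH)
    return seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def design(iterations=5, pool_size=20, n_proposals=500, seed=0):
    rng = random.Random(seed)
    pool = {}
    while len(pool) < pool_size:  # random initial candidate set
        seq = "".join(rng.choice(ALPHABET) for _ in range(LENGTH))
        pool[seq] = expensive_score(seq)
    all_scored = dict(pool)       # union of every sequence scored so far

    for _ in range(iterations):
        predict = train_surrogate(all_scored)  # retrain on the full union
        parents = sorted(pool)
        proposals = {mutate(rng.choice(parents), rng) for _ in range(n_proposals)}
        candidates = sorted(proposals - set(all_scored))
        ranked = sorted(candidates, key=predict)  # cheap ranking by predicted score
        new_scores = {s: expensive_score(s) for s in ranked[: pool_size // 2]}
        all_scored.update(new_scores)
        # Keep the best existing members and replace the rest (up to half).
        keep = sorted(pool, key=pool.get)[: pool_size - len(new_scores)]
        pool = {s: pool[s] for s in keep}
        pool.update(new_scores)
    return pool, all_scored
```

Calling `design()` returns the final candidate pool and the dictionary of all sequences that received the costly score. With the defaults, at most 70 of the 4096 possible toy sequences are scored with `expensive_score`, while the surrogate ranks several hundred proposals per iteration. The `pearson` helper can be used to compare surrogate predictions with the costly scores, in the spirit of the correlation check mentioned in the description.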