Protein sequence design by conformational landscape optimization
The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen’s thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest energy state. As this calculation involves not...
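Main Authors: | Norn, Christoffer; Wicky, Basile I. M.; Juergens, David; Liu, Sirui; Kim, David; Tischer, Doug; Koepnick, Brian; Anishchenko, Ivan; Baker, David; Ovchinnikov, Sergey |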
---|---|
Format: | Online Article Text |
Language: | English |
Published: | National Academy of Sciences 2021 |
Subjects: | Physical Sciences |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980421/ https://www.ncbi.nlm.nih.gov/pubmed/33712545 http://dx.doi.org/10.1073/pnas.2017228118 |
_version_ | 1783667437133103104 |
---|---|
author | Norn, Christoffer Wicky, Basile I. M. Juergens, David Liu, Sirui Kim, David Tischer, Doug Koepnick, Brian Anishchenko, Ivan Baker, David Ovchinnikov, Sergey |
author_sort | Norn, Christoffer |
collection | PubMed |
description | The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen’s thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest energy state. As this calculation involves not only all possible amino acid sequences but also all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest-energy amino acid sequence for the desired structure, often checking by protein structure prediction in a second step that the desired structure is indeed the lowest-energy conformation for the designed sequence, and typically discarding a large fraction of designed sequences for which this is not the case. Here, we show that by backpropagating gradients through the transform-restrained Rosetta (trRosetta) structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures in a single calculation. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single-point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by conformational landscape optimization with the standard energy-based sequence design methodology in Rosetta and show that the former can result in energy landscapes with fewer alternative energy minima. We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low-resolution trRosetta model serves to disfavor alternative states, and the high-resolution Rosetta model serves to create a deep energy minimum at the design target structure. |
format | Online Article Text |
id | pubmed-7980421 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | National Academy of Sciences |
record_format | MEDLINE/PubMed |
spelling | pubmed-7980421 2021-03-26 Protein sequence design by conformational landscape optimization Norn, Christoffer Wicky, Basile I. M. Juergens, David Liu, Sirui Kim, David Tischer, Doug Koepnick, Brian Anishchenko, Ivan Baker, David Ovchinnikov, Sergey Proc Natl Acad Sci U S A Physical Sciences National Academy of Sciences 2021-03-16 2021-03-12 /pmc/articles/PMC7980421/ /pubmed/33712545 http://dx.doi.org/10.1073/pnas.2017228118 Text en Copyright © 2021 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
title | Protein sequence design by conformational landscape optimization |
topic | Physical Sciences |