Protein sequence design by conformational landscape optimization

The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen’s thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest energy state. As this calculation involves not...

Bibliographic Details
Main authors: Norn, Christoffer, Wicky, Basile I. M., Juergens, David, Liu, Sirui, Kim, David, Tischer, Doug, Koepnick, Brian, Anishchenko, Ivan, Baker, David, Ovchinnikov, Sergey
Format: Online Article Text
Language: English
Published: National Academy of Sciences 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980421/
https://www.ncbi.nlm.nih.gov/pubmed/33712545
http://dx.doi.org/10.1073/pnas.2017228118
author Norn, Christoffer
Wicky, Basile I. M.
Juergens, David
Liu, Sirui
Kim, David
Tischer, Doug
Koepnick, Brian
Anishchenko, Ivan
Baker, David
Ovchinnikov, Sergey
collection PubMed
description The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen’s thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest-energy state. As this calculation involves not only all possible amino acid sequences but also all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest-energy amino acid sequence for the desired structure. A second step often checks, by protein structure prediction, that the desired structure is indeed the lowest-energy conformation for the designed sequence, and the large fraction of designed sequences for which this is not the case is typically discarded. Here, we show that by backpropagating gradients through the transform-restrained Rosetta (trRosetta) structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures in a single calculation. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single-point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by conformational landscape optimization with the standard energy-based sequence design methodology in Rosetta and show that the former can result in energy landscapes with fewer alternative energy minima. We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low-resolution trRosetta model serves to disfavor alternative states, and the high-resolution Rosetta model serves to create a deep energy minimum at the design target structure.
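The core idea in the description, backpropagating from a desired structure to the input sequence, can be illustrated with a toy sketch. The code below is not trRosetta and not the authors' method: the network is replaced by a hypothetical bilinear "predictor" (a random symmetric matrix `W`), the structure by a binary contact map, and the optimizer by plain gradient descent with a simple step-size backoff. All names (`loss_and_grad`, `design`, `W`, the helix-like target) are assumptions made for illustration only.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(x):
    """Row-wise softmax, shifted for numerical stability."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def loss_and_grad(logits, W, target):
    """Cross-entropy between predicted and target contacts, plus d(loss)/d(logits).

    Toy stand-in for a structure predictor (assumption, not trRosetta):
    contact logits are C = P W P^T, where P is the soft sequence.
    """
    P = softmax(logits)                      # (L, 20) per-position amino acid probabilities
    C = P @ W @ P.T                          # (L, L) predicted contact logits
    loss = np.mean(np.logaddexp(0.0, C) - target * C)  # BCE with logits
    G = (1.0 / (1.0 + np.exp(-C)) - target) / C.size   # d(loss)/dC
    dP = G @ P @ W.T + G.T @ P @ W                     # chain rule through C = P W P^T
    dlogits = P * (dP - np.sum(P * dP, axis=-1, keepdims=True))  # softmax backward
    return loss, dlogits


def design(target, n_steps=300, seed=0):
    """Optimize sequence logits so the toy predictor reproduces `target`."""
    rng = np.random.default_rng(seed)
    length = target.shape[0]
    W = rng.normal(size=(20, 20))
    W = (W + W.T) / 2.0                      # symmetric pair "energies" (hypothetical)
    logits = 0.01 * rng.normal(size=(length, 20))
    lr = 10.0
    loss, grad = loss_and_grad(logits, W, target)
    history = [loss]
    for _ in range(n_steps):
        trial = logits - lr * grad
        trial_loss, trial_grad = loss_and_grad(trial, W, target)
        if trial_loss < loss:                # accept only steps that lower the loss
            logits, loss, grad = trial, trial_loss, trial_grad
            lr *= 1.2
        else:                                # otherwise shrink the step and retry
            lr *= 0.5
        history.append(loss)
    sequence = "".join(AMINO_ACIDS[i] for i in logits.argmax(axis=-1))
    return sequence, history


# Hypothetical helix-like target: residues i and i+4 are in contact.
idx = np.arange(12)
target = (np.abs(idx[:, None] - idx[None, :]) == 4).astype(float)
sequence, history = design(target)
print(sequence, history[0], history[-1])
```

Because the loss is evaluated on the predictor's output rather than on a single fixed structure's energy, the gradient reaches the sequence through the prediction itself, which is the aspect of the approach this sketch is meant to convey. A real application would substitute a trained structure prediction network and its automatic differentiation for the hand-written gradient.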
format Online
Article
Text
id pubmed-7980421
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-7980421 2021-03-26 Protein sequence design by conformational landscape optimization Proc Natl Acad Sci U S A Physical Sciences National Academy of Sciences 2021-03-16 2021-03-12 /pmc/articles/PMC7980421/ /pubmed/33712545 http://dx.doi.org/10.1073/pnas.2017228118 Text en Copyright © 2021 the Author(s). Published by PNAS. This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Protein sequence design by conformational landscape optimization
topic Physical Sciences