Sampling of structure and sequence space of small protein folds
Nature only samples a small fraction of the sequence space that can fold into stable proteins. Furthermore, small structural variations in a single fold, sometimes only a few amino acids, can define a protein’s molecular function. Hence, to design proteins with novel functionalities, such as molecul...
Main authors: | Linsky, Thomas W.; Noble, Kyle; Tobin, Autumn R.; Crow, Rachel; Carter, Lauren; Urbauer, Jeffrey L.; Baker, David; Strauch, Eva-Maria |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9684540/ https://www.ncbi.nlm.nih.gov/pubmed/36418330 http://dx.doi.org/10.1038/s41467-022-34937-8 |
_version_ | 1784835309420150784 |
---|---|
author | Linsky, Thomas W. Noble, Kyle Tobin, Autumn R. Crow, Rachel Carter, Lauren Urbauer, Jeffrey L. Baker, David Strauch, Eva-Maria |
author_sort | Linsky, Thomas W. |
collection | PubMed |
description | Nature only samples a small fraction of the sequence space that can fold into stable proteins. Furthermore, small structural variations in a single fold, sometimes only a few amino acids, can define a protein’s molecular function. Hence, to design proteins with novel functionalities, such as molecular recognition, methods to control and sample shape diversity are necessary. To explore this space, we developed and experimentally validated a computational platform that can design a wide variety of small protein folds while sampling shape diversity. We designed and evaluated stability of about 30,000 de novo protein designs of eight different folds. Among these designs, about 6,200 stable proteins were identified, including some predicted to have a first-of-its-kind minimalized thioredoxin fold. Obtained data revealed protein folding rules for structural features such as helix-connecting loops. Beyond serving as a resource for protein engineering, this massive and diverse dataset also provides training data for machine learning. We developed an accurate classifier to predict the stability of our designed proteins. The methods and the wide range of protein shapes provide a basis for designing new protein functions without compromising stability. |
format | Online Article Text |
id | pubmed-9684540 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9684540; 2022-11-25; Nat Commun; Article; Nature Publishing Group UK; 2022-11-22; /pmc/articles/PMC9684540/ /pubmed/36418330 http://dx.doi.org/10.1038/s41467-022-34937-8 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. |
title | Sampling of structure and sequence space of small protein folds |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9684540/ https://www.ncbi.nlm.nih.gov/pubmed/36418330 http://dx.doi.org/10.1038/s41467-022-34937-8 |