Mega-scale experimental analysis of protein folding stability in biology and design
Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale(1). However, the energetics driving folding are invisible in these structures and remain largely unknown(2). The hidden thermodynamics of folding can drive disease(3,4),...
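Main Authors: | Tsuboyama, Kotaro; Dauparas, Justas; Chen, Jonathan; Laine, Elodie; Mohseni Behbahani, Yasser; Weinstein, Jonathan J.; Mangan, Niall M.; Ovchinnikov, Sergey; Rocklin, Gabriel J. |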
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10412457/ https://www.ncbi.nlm.nih.gov/pubmed/37468638 http://dx.doi.org/10.1038/s41586-023-06328-6 |
_version_ | 1785086909146464256 |
---|---|
author | Tsuboyama, Kotaro Dauparas, Justas Chen, Jonathan Laine, Elodie Mohseni Behbahani, Yasser Weinstein, Jonathan J. Mangan, Niall M. Ovchinnikov, Sergey Rocklin, Gabriel J. |
author_facet | Tsuboyama, Kotaro Dauparas, Justas Chen, Jonathan Laine, Elodie Mohseni Behbahani, Yasser Weinstein, Jonathan J. Mangan, Niall M. Ovchinnikov, Sergey Rocklin, Gabriel J. |
author_sort | Tsuboyama, Kotaro |
collection | PubMed |
description | Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale(1). However, the energetics driving folding are invisible in these structures and remain largely unknown(2). The hidden thermodynamics of folding can drive disease(3,4), shape protein evolution(5–7) and guide protein engineering(8–10), and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40–72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability. |
format | Online Article Text |
id | pubmed-10412457 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10412457 2023-08-11 Mega-scale experimental analysis of protein folding stability in biology and design Tsuboyama, Kotaro Dauparas, Justas Chen, Jonathan Laine, Elodie Mohseni Behbahani, Yasser Weinstein, Jonathan J. Mangan, Niall M. Ovchinnikov, Sergey Rocklin, Gabriel J. Nature Article Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale(1). However, the energetics driving folding are invisible in these structures and remain largely unknown(2). The hidden thermodynamics of folding can drive disease(3,4), shape protein evolution(5–7) and guide protein engineering(8–10), and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40–72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.
Nature Publishing Group UK 2023-07-19 2023 /pmc/articles/PMC10412457/ /pubmed/37468638 http://dx.doi.org/10.1038/s41586-023-06328-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Tsuboyama, Kotaro Dauparas, Justas Chen, Jonathan Laine, Elodie Mohseni Behbahani, Yasser Weinstein, Jonathan J. Mangan, Niall M. Ovchinnikov, Sergey Rocklin, Gabriel J. Mega-scale experimental analysis of protein folding stability in biology and design |
title | Mega-scale experimental analysis of protein folding stability in biology and design |
title_full | Mega-scale experimental analysis of protein folding stability in biology and design |
title_fullStr | Mega-scale experimental analysis of protein folding stability in biology and design |
title_full_unstemmed | Mega-scale experimental analysis of protein folding stability in biology and design |
title_short | Mega-scale experimental analysis of protein folding stability in biology and design |
title_sort | mega-scale experimental analysis of protein folding stability in biology and design |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10412457/ https://www.ncbi.nlm.nih.gov/pubmed/37468638 http://dx.doi.org/10.1038/s41586-023-06328-6 |