
Transfer learning to leverage larger datasets for improved prediction of protein stability changes

Amino acid mutations that lower a protein’s thermodynamic stability are implicated in numerous diseases, and engineered proteins with enhanced stability are important in research and medicine. Computational methods for predicting how mutations perturb protein stability are therefore of great interes...


Bibliographic Details
Main Authors:	Dieckhaus, Henry; Brocidiacono, Michael; Randolph, Nicholas; Kuhlman, Brian
Format:	Online Article Text
Language:	English
Published:	Cold Spring Harbor Laboratory 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10402116/ https://www.ncbi.nlm.nih.gov/pubmed/37547004 http://dx.doi.org/10.1101/2023.07.27.550881

author	Dieckhaus, Henry; Brocidiacono, Michael; Randolph, Nicholas; Kuhlman, Brian
collection	PubMed
description	Amino acid mutations that lower a protein’s thermodynamic stability are implicated in numerous diseases, and engineered proteins with enhanced stability are important in research and medicine. Computational methods for predicting how mutations perturb protein stability are therefore of great interest. Despite recent advancements in protein design using deep learning, in silico prediction of stability changes has remained challenging, in part due to a lack of large, high-quality training datasets for model development. Here we introduce ThermoMPNN, a deep neural network trained to predict stability changes for protein point mutations given an initial structure. In doing so, we demonstrate the utility of a newly released mega-scale stability dataset for training a robust stability model. We also employ transfer learning to leverage a second, larger dataset by using learned features extracted from a deep neural network trained to predict a protein’s amino acid sequence given its three-dimensional structure. We show that our method achieves competitive performance on established benchmark datasets using a lightweight model architecture that allows for rapid, scalable predictions. Finally, we make ThermoMPNN readily available as a tool for stability prediction and design.
format	Online Article Text
id	pubmed-10402116
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Cold Spring Harbor Laboratory
record_format	MEDLINE/PubMed
spelling	pubmed-10402116 2023-08-05 Transfer learning to leverage larger datasets for improved prediction of protein stability changes Dieckhaus, Henry Brocidiacono, Michael Randolph, Nicholas Kuhlman, Brian bioRxiv Article Cold Spring Harbor Laboratory 2023-07-30 /pmc/articles/PMC10402116/ /pubmed/37547004 http://dx.doi.org/10.1101/2023.07.27.550881 Text en https://creativecommons.org/licenses/by-nd/4.0/ This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use.
title	Transfer learning to leverage larger datasets for improved prediction of protein stability changes
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10402116/ https://www.ncbi.nlm.nih.gov/pubmed/37547004 http://dx.doi.org/10.1101/2023.07.27.550881
