PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate

Most proteins fold into characteristic three-dimensional structures. The rate of folding and unfolding varies widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for the prediction of the folding rate effects of amino acid substitutions in two-sta...

Bibliographic Details
Main authors: Yang, Yang; Chong, Zhang; Vihinen, Mauno
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10455311/
https://www.ncbi.nlm.nih.gov/pubmed/37629203
http://dx.doi.org/10.3390/ijms241613023
author Yang, Yang
Chong, Zhang
Vihinen, Mauno
collection PubMed
description Most proteins fold into characteristic three-dimensional structures. The rates of folding and unfolding vary widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for predicting the effects of amino acid substitutions on the folding rate of two-state folding proteins. We collected a data set of experimentally defined folding rates for variants and used it to train a gradient boosting algorithm, starting with 1161 features. Two predictors were designed. The three-class classifier had, in blind tests, specificity and sensitivity ranging from 0.324 to 0.419 and from 0.256 to 0.451, respectively. The other tool was a regression predictor that showed a Pearson correlation coefficient of 0.525. The error measures, mean absolute error and mean squared error, were 0.581 and 0.603, respectively. One of the previously presented tools could be used for comparison on the blind test data set; our method, called PON-Fold, showed superior performance on all measures used. The applicability of the tool was tested by predicting all possible substitutions in a protein domain. Predictions for different conformations of proteins (open and closed forms of a protein kinase, and apo and holo forms of an enzyme) indicated that the choice of structure had a large impact on the outcome. PON-Fold is freely available.
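The performance measures named in the description (sensitivity and specificity for the three-class classifier; Pearson correlation coefficient, mean absolute error and mean squared error for the regression predictor) can be illustrated with a minimal sketch. This is not the PON-Fold code: the function names, the one-vs-rest treatment of the three classes, the class labels and the toy values are all assumptions made for illustration, and the record does not state how the authors computed their figures.

```python
import math


def per_class_sensitivity_specificity(y_true, y_pred, labels):
    """One-vs-rest sensitivity and specificity for each class label.

    Assumption: each class is scored against the other two combined.
    """
    results = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        tn = sum(1 for t, p in pairs if t != label and p != label)
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results[label] = (sensitivity, specificity)
    return results


def regression_metrics(y_true, y_pred):
    """Return (Pearson r, mean absolute error, mean squared error)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    pearson = cov / math.sqrt(var_t * var_p)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return pearson, mae, mse


if __name__ == "__main__":
    # Hypothetical class labels and toy data, not taken from the article.
    labels = ["decrease", "no_effect", "increase"]
    observed = ["decrease", "decrease", "no_effect",
                "increase", "increase", "no_effect"]
    predicted = ["decrease", "no_effect", "no_effect",
                 "increase", "decrease", "no_effect"]
    for label, (sens, spec) in per_class_sensitivity_specificity(
            observed, predicted, labels).items():
        print(f"{label}: sensitivity={sens:.3f} specificity={spec:.3f}")

    r, mae, mse = regression_metrics([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
    print(f"Pearson r={r:.3f} MAE={mae:.3f} MSE={mse:.3f}")
```

With three classes, sensitivity and specificity are per-class quantities, which is consistent with the description reporting each as a range rather than a single value.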
format Online
Article
Text
id pubmed-10455311
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10455311 2023-08-26 PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate Yang, Yang Chong, Zhang Vihinen, Mauno Int J Mol Sci Article MDPI 2023-08-21 /pmc/articles/PMC10455311/ /pubmed/37629203 http://dx.doi.org/10.3390/ijms241613023 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate
topic Article