
Predicting protein stability changes upon single-point mutation: a thorough comparison of the available tools on a new dataset

Predicting the difference in thermodynamic stability between protein variants is crucial for protein design and understanding the genotype-phenotype relationships. So far, several computational tools have been created to address this task. Nevertheless, most of them have been trained or optimized on...

Bibliographic Details
Main authors: Pancotti, Corrado; Benevenuta, Silvia; Birolo, Giovanni; Alberini, Virginia; Repetto, Valeria; Sanavia, Tiziana; Capriotti, Emidio; Fariselli, Piero
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Problem Solving Protocol
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921618/
https://www.ncbi.nlm.nih.gov/pubmed/35021190
http://dx.doi.org/10.1093/bib/bbab555
author Pancotti, Corrado
Benevenuta, Silvia
Birolo, Giovanni
Alberini, Virginia
Repetto, Valeria
Sanavia, Tiziana
Capriotti, Emidio
Fariselli, Piero
author_sort Pancotti, Corrado
collection PubMed
description Predicting the difference in thermodynamic stability between protein variants is crucial for protein design and for understanding genotype-phenotype relationships. So far, several computational tools have been created to address this task. Nevertheless, most of them have been trained or optimized on the same and ‘all’ available data, making a fair comparison unfeasible. Here, we introduce a novel dataset, collected and manually cleaned from the latest version of the ThermoMutDB database, consisting of 669 variants not included in the most widely used training datasets. The prediction performance and the ability to satisfy the antisymmetry property by considering both direct and reverse variants were evaluated across 21 different tools. The Pearson correlations of the tested tools were in the ranges of 0.21–0.5 and 0–0.45 for the direct and reverse variants, respectively. When both direct and reverse variants are considered, the antisymmetric methods perform better, achieving a Pearson correlation in the range of 0.51–0.62. The tested methods seem relatively insensitive to the physiological conditions, also performing well on the variants measured at more extreme pH and temperature values. A common issue with all the tested methods is the compression of the ΔΔG predictions toward zero. Furthermore, the thermodynamic stability of the most significantly stabilizing variants was found to be more challenging to predict. This study is the most extensive comparison of prediction methods using an entirely novel set of variants never tested before.
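The evaluation summarized in the description (Pearson correlation on direct and reverse variants, plus a check of antisymmetry) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that the experimental value of a reverse variant is the negated direct value, and that antisymmetry is scored by the correlation between direct and reverse predictions together with their mean half-sum; the function names `pearson`, `antisymmetry_bias` and `evaluate` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def antisymmetry_bias(pred_direct, pred_reverse):
    """Mean of (direct + reverse) / 2; zero for a perfectly antisymmetric tool."""
    total = sum(d + r for d, r in zip(pred_direct, pred_reverse))
    return total / (2 * len(pred_direct))


def evaluate(exp_direct, pred_direct, pred_reverse):
    """Score one tool on direct variants and on their reverse counterparts."""
    # Assumption: the experimental reverse value is the negated direct value.
    exp_reverse = [-value for value in exp_direct]
    return {
        "r_direct": pearson(exp_direct, pred_direct),
        "r_reverse": pearson(exp_reverse, pred_reverse),
        # Close to -1 when predictions respect antisymmetry.
        "r_direct_reverse": pearson(pred_direct, pred_reverse),
        "bias": antisymmetry_bias(pred_direct, pred_reverse),
    }


# Toy values (not taken from the study): an antisymmetric predictor whose
# outputs are compressed toward zero by a factor of two.
scores = evaluate(
    exp_direct=[2.0, -4.0, 1.0],
    pred_direct=[1.0, -2.0, 0.5],
    pred_reverse=[-1.0, 2.0, -0.5],
)
```

Pearson correlation is insensitive to a uniform rescaling of the predictions, so the compression toward zero mentioned in the description would not show up in these scores; an error measure such as RMSE would be needed to expose it.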
format Online
Article
Text
id pubmed-8921618
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8921618 2022-03-15 Brief Bioinform Problem Solving Protocol
Oxford University Press 2022-01-11 /pmc/articles/PMC8921618/ /pubmed/35021190 http://dx.doi.org/10.1093/bib/bbab555 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Predicting protein stability changes upon single-point mutation: a thorough comparison of the available tools on a new dataset
topic Problem Solving Protocol