Feature-based multiple models improve classification of mutation-induced stability changes

BACKGROUND: Reliable prediction of stability changes in protein variants is an important aspect of computational protein design. A number of machine learning methods that allow a classification of stability changes knowing only the sequence of the protein have emerged. However, their performance on amino...

Bibliographic Details
Main Authors: Folkman, Lukas; Stantic, Bela; Sattar, Abdul
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4083411/
https://www.ncbi.nlm.nih.gov/pubmed/25057118
http://dx.doi.org/10.1186/1471-2164-15-S4-S6
author Folkman, Lukas
Stantic, Bela
Sattar, Abdul
collection PubMed
description BACKGROUND: Reliable prediction of stability changes in protein variants is an important aspect of computational protein design. A number of machine learning methods that allow a classification of stability changes knowing only the sequence of the protein have emerged. However, their performance on amino acid substitutions of previously unseen non-homologous proteins is rather limited. Moreover, the performance varies for different types of mutations based on the secondary structure or accessible surface area of the mutation site. RESULTS: We proposed feature-based multiple models with each model designed for a specific type of mutation. The new method is composed of five models trained for mutations in exposed, buried, helical, sheet, and coil residues. The classification of a mutation as stabilising or destabilising is made as a consensus of two models, one selected based on the predicted accessible surface area and the other based on the predicted secondary structure of the mutation site. We refer to our new method as Evolutionary, Amino acid, and Structural Encodings with Multiple Models (EASE-MM). Cross-validation results show that EASE-MM provides a notable improvement over our previous work, reaching a Matthews correlation coefficient of 0.44. EASE-MM was able to correctly classify 73% and 75% of stabilising and destabilising protein variants, respectively. Using an independent test set of 238 mutations, we confirmed our results in a comparison with related work. CONCLUSIONS: EASE-MM not only outperformed other related methods but achieved more balanced results for different types of mutations based on the accessible surface area, secondary structure, or magnitude of stability changes. This can be attributed to using multiple models with the most relevant features selected for the given type of mutation. Therefore, our results support the presumption that different interactions govern stability changes in the exposed and buried residues or in residues with a different secondary structure.
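The two-model consensus described in the abstract can be illustrated with a short Python sketch. The abstract does not state how the consensus is formed, which accessible-surface-area cutoff separates exposed from buried residues, or what kind of model is used, so everything below is an assumption made for illustration: the `Mutation` fields, the `ASA_THRESHOLD` value, the score convention (positive means stabilising), and the averaging of the two scores are all hypothetical and are not taken from EASE-MM itself. A standard Matthews correlation coefficient helper is included because the abstract reports that metric.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Dict, Sequence

# Hypothetical model interface: feature vector -> real-valued score.
# Assumed convention: score > 0 means "stabilising".
Model = Callable[[Sequence[float]], float]

# Hypothetical cutoff on predicted relative accessible surface area.
ASA_THRESHOLD = 0.25


@dataclass
class Mutation:
    features: Sequence[float]
    predicted_asa: float  # predicted relative ASA in [0, 1] (assumed scale)
    predicted_ss: str     # predicted secondary structure: "H", "E", or "C"


def classify(mutation: Mutation,
             asa_models: Dict[str, Model],
             ss_models: Dict[str, Model]) -> str:
    """Pick one ASA model and one secondary-structure model, then combine."""
    asa_key = "exposed" if mutation.predicted_asa >= ASA_THRESHOLD else "buried"
    ss_key = {"H": "helix", "E": "sheet", "C": "coil"}[mutation.predicted_ss]
    # Assumed consensus rule: plain average of the two selected scores.
    score = 0.5 * (asa_models[asa_key](mutation.features)
                   + ss_models[ss_key](mutation.features))
    return "stabilising" if score > 0 else "destabilising"


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

In this sketch, five models in total are supplied (two keyed by `"exposed"` and `"buried"`, three keyed by `"helix"`, `"sheet"`, and `"coil"`), and each mutation is scored by exactly two of them, mirroring the routing the abstract describes.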
format Online
Article
Text
id pubmed-4083411
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4083411 2014-07-18 Feature-based multiple models improve classification of mutation-induced stability changes Folkman, Lukas Stantic, Bela Sattar, Abdul BMC Genomics Research BioMed Central 2014-05-20 /pmc/articles/PMC4083411/ /pubmed/25057118 http://dx.doi.org/10.1186/1471-2164-15-S4-S6 Text en Copyright © 2014 Folkman et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Feature-based multiple models improve classification of mutation-induced stability changes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4083411/
https://www.ncbi.nlm.nih.gov/pubmed/25057118
http://dx.doi.org/10.1186/1471-2164-15-S4-S6