Feature-based multiple models improve classification of mutation-induced stability changes
BACKGROUND: Reliable prediction of stability changes in protein variants is an important aspect of computational protein design. A number of machine learning methods have emerged that classify stability changes knowing only the sequence of the protein. However, their performance on amino...
Main authors: | Folkman, Lukas; Stantic, Bela; Sattar, Abdul |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4083411/ https://www.ncbi.nlm.nih.gov/pubmed/25057118 http://dx.doi.org/10.1186/1471-2164-15-S4-S6 |
_version_ | 1782324374974496768 |
---|---|
author | Folkman, Lukas Stantic, Bela Sattar, Abdul |
author_facet | Folkman, Lukas Stantic, Bela Sattar, Abdul |
author_sort | Folkman, Lukas |
collection | PubMed |
description | BACKGROUND: Reliable prediction of stability changes in protein variants is an important aspect of computational protein design. A number of machine learning methods have emerged that classify stability changes knowing only the sequence of the protein. However, their performance on amino acid substitutions in previously unseen non-homologous proteins is rather limited. Moreover, the performance varies for different types of mutations depending on the secondary structure or accessible surface area of the mutation site. RESULTS: We proposed feature-based multiple models, with each model designed for a specific type of mutation. The new method is composed of five models trained for mutations in exposed, buried, helical, sheet, and coil residues. The classification of a mutation as stabilising or destabilising is made as a consensus of two models, one selected based on the predicted accessible surface area and the other based on the predicted secondary structure of the mutation site. We refer to our new method as Evolutionary, Amino acid, and Structural Encodings with Multiple Models (EASE-MM). Cross-validation results show that EASE-MM provides a notable improvement over our previous work, reaching a Matthews correlation coefficient of 0.44. EASE-MM was able to correctly classify 73% and 75% of stabilising and destabilising protein variants, respectively. Using an independent test set of 238 mutations, we confirmed our results in a comparison with related work. CONCLUSIONS: EASE-MM not only outperformed other related methods but also achieved more balanced results for different types of mutations based on the accessible surface area, secondary structure, or magnitude of stability changes. This can be attributed to using multiple models with the most relevant features selected for the given type of mutation. Therefore, our results support the presumption that different interactions govern stability changes in exposed and buried residues or in residues with a different secondary structure. |
format | Online Article Text |
id | pubmed-4083411 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4083411 2014-07-18 Feature-based multiple models improve classification of mutation-induced stability changes Folkman, Lukas Stantic, Bela Sattar, Abdul BMC Genomics Research BioMed Central 2014-05-20 /pmc/articles/PMC4083411/ /pubmed/25057118 http://dx.doi.org/10.1186/1471-2164-15-S4-S6 Text en Copyright © 2014 Folkman et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Folkman, Lukas Stantic, Bela Sattar, Abdul Feature-based multiple models improve classification of mutation-induced stability changes |
title | Feature-based multiple models improve classification of mutation-induced stability changes |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4083411/ https://www.ncbi.nlm.nih.gov/pubmed/25057118 http://dx.doi.org/10.1186/1471-2164-15-S4-S6 |
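The description above says each mutation is classified by a consensus of two models, one chosen from the predicted accessible surface area and one from the predicted secondary structure. The following is a minimal Python sketch of that routing idea only, not the authors' implementation. The record does not say how the consensus is formed or where the exposed/buried boundary lies, so the score averaging, the `ASA_THRESHOLD` value, the sign convention, and the toy models are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

# A model maps a feature vector to a signed score.
# Assumed convention: positive = stabilising, negative = destabilising.
Model = Callable[[Sequence[float]], float]

# Hypothetical cut-off on predicted relative accessible surface area;
# the catalog record does not state the value used by EASE-MM.
ASA_THRESHOLD = 0.25

# Hypothetical mapping from three-state secondary structure codes to model names.
SS_TO_MODEL = {"H": "helix", "E": "sheet", "C": "coil"}


def select_models(
    models: Dict[str, Model], predicted_rasa: float, predicted_ss: str
) -> Tuple[Model, Model]:
    """Pick one surface-area-specific and one secondary-structure-specific model."""
    asa_key = "exposed" if predicted_rasa >= ASA_THRESHOLD else "buried"
    ss_key = SS_TO_MODEL[predicted_ss]
    return models[asa_key], models[ss_key]


def classify(
    models: Dict[str, Model],
    features: Sequence[float],
    predicted_rasa: float,
    predicted_ss: str,
) -> str:
    """Classify a mutation by combining the two selected models."""
    asa_model, ss_model = select_models(models, predicted_rasa, predicted_ss)
    # Assumed consensus rule: the mean of the two scores.
    score = 0.5 * (asa_model(features) + ss_model(features))
    return "stabilising" if score > 0 else "destabilising"


# Toy stand-ins for the five trained models (constant scores, illustration only).
toy_models: Dict[str, Model] = {
    "exposed": lambda x: 0.8,
    "buried": lambda x: -0.6,
    "helix": lambda x: 0.1,
    "sheet": lambda x: -0.3,
    "coil": lambda x: 0.2,
}

if __name__ == "__main__":
    print(classify(toy_models, [0.0], predicted_rasa=0.5, predicted_ss="H"))
    print(classify(toy_models, [0.0], predicted_rasa=0.1, predicted_ss="E"))
```

Keeping model selection separate from the consensus step means a different combination rule, for example a weighted vote, could replace the averaging without changing how the two models are chosen.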