Grading amino acid properties increased accuracies of single point mutation on protein stability prediction

BACKGROUND: Point mutations introduced into a protein can affect its stability. Machine learning approaches that predict protein stability from sequence information currently use two kinds of encoding schemes: sparse encoding and amino acid property encoding. Property encoding schemes...

Bibliographic Details
Main Authors:	Liu, Jianguo, Kang, Xianjiang
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3820156/ https://www.ncbi.nlm.nih.gov/pubmed/22435732 http://dx.doi.org/10.1186/1471-2105-13-44

_version_	1782290092945047552
author	Liu, Jianguo Kang, Xianjiang
author_facet	Liu, Jianguo Kang, Xianjiang
author_sort	Liu, Jianguo
collection	PubMed
description	BACKGROUND: Point mutations introduced into a protein can affect its stability. Machine learning approaches that predict protein stability from sequence information currently use two kinds of encoding schemes: sparse encoding and amino acid property encoding. Property encoding schemes use physicochemical information about the mutated protein environment; however, they become complex when many properties are combined in one scheme. This complexity introduces noise that reduces the accuracy of machine learning algorithms. To overcome this problem, we describe a new encoding scheme that grades the twenty amino acids into groups according to their property values. RESULTS: We used three predefined values, 0.1, 0.5, and 0.9, to represent the 'weak', 'middle', and 'strong' groups for each amino acid property, and introduced two thresholds per property to assign each of the twenty amino acids to one of the three groups according to its property value. Each amino acid therefore takes one of three predefined values for each property rather than one of twenty different values. This reduces the complexity and noise in the encoding schemes. The graded amino acid property encoding schemes improved average accuracy by more than 7% in 20-fold cross-validation. Starting from sequence information and using three-state prediction definitions, the overall accuracy of our method exceeds 72% on the independent test sets. CONCLUSIONS: Grading the numeric values of amino acid properties can reduce the noise and complexity of the input information. It is consistent with biochemical concepts of amino acid properties and simplifies the input data at the same time. The idea of graded property encoding schemes may be applicable to protein-related predictions that use machine learning approaches.
format	Online Article Text
id	pubmed-3820156
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3820156 2013-11-11 Grading amino acid properties increased accuracies of single point mutation on protein stability prediction Liu, Jianguo Kang, Xianjiang BMC Bioinformatics Methodology Article BioMed Central 2012-03-22 /pmc/articles/PMC3820156/ /pubmed/22435732 http://dx.doi.org/10.1186/1471-2105-13-44 Text en Copyright © 2012 Liu and Kang; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Grading amino acid properties increased accuracies of single point mutation on protein stability prediction
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3820156/ https://www.ncbi.nlm.nih.gov/pubmed/22435732 http://dx.doi.org/10.1186/1471-2105-13-44
work_keys_str_mv	AT liujianguo gradingaminoacidpropertiesincreasedaccuraciesofsinglepointmutationonproteinstabilityprediction AT kangxianjiang gradingaminoacidpropertiesincreasedaccuraciesofsinglepointmutationonproteinstabilityprediction
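
The graded encoding described in the abstract can be illustrated with a minimal sketch. Only the three group values (0.1, 0.5, 0.9) and the use of two thresholds per property come from the record above. Everything else is an assumption made for illustration: the Kyte-Doolittle hydropathy index stands in for whichever properties the authors used, the thresholds -1.0 and 1.0 are hypothetical, and the function names grade and encode_fragment, the boundary handling, and the example fragment are not taken from the article.

```python
# Kyte-Doolittle hydropathy index, used here only as an example property.
# The record does not say which properties the authors graded.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# The three predefined group values named in the abstract.
WEAK, MIDDLE, STRONG = 0.1, 0.5, 0.9

# Hypothetical thresholds; the abstract gives no numeric thresholds.
LOW, HIGH = -1.0, 1.0


def grade(value, low, high):
    """Map a raw property value to one of the three group values.

    Values below `low` are 'weak', values above `high` are 'strong',
    and everything in between (bounds included) is 'middle'. How the
    boundaries are treated is an assumption of this sketch.
    """
    if value < low:
        return WEAK
    if value > high:
        return STRONG
    return MIDDLE


def encode_fragment(fragment, prop, low, high):
    """Encode each residue of a sequence fragment as its graded value."""
    return [grade(prop[residue], low, high) for residue in fragment]


# Hypothetical five-residue fragment.
encoded = encode_fragment("AKGIL", HYDROPATHY, LOW, HIGH)
print(encoded)
```

With this scheme each residue contributes one of three values per property instead of one of twenty, which is the reduction in input complexity the abstract describes.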
