
Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

BACKGROUND: The β-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. T...


Bibliographic Details
Main authors: Zheng, Ce; Kurgan, Lukasz
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2613158/
https://www.ncbi.nlm.nih.gov/pubmed/18847492
http://dx.doi.org/10.1186/1471-2105-9-430
_version_ 1782163161561956352
author Zheng, Ce
Kurgan, Lukasz
author_facet Zheng, Ce
Kurgan, Lukasz
author_sort Zheng, Ce
collection PubMed
description BACKGROUND: The β-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which, together with a selected set of position-specific scoring matrix (PSSM) values, serves as input to a support vector machine (SVM) predictor.
RESULTS: We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, the secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which corroborates recent studies. Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful for predicting non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Q(total), Q(predicted), and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q(total) barrier and achieves Q(total) = 80.9%, MCC = 0.47, and a Q(predicted) higher by over 6% when compared with the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively.
CONCLUSION: Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model discriminates better between β-turns and non-β-turns because it produces fewer false positive predictions. The prediction model and datasets are freely available at .
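The three quality measures named in the abstract can be computed from a binary confusion matrix. The sketch below is not taken from the record: it assumes the definitions commonly used in the β-turn prediction literature (Q(total) as the percentage of correctly classified residues, Q(predicted) as the percentage of predicted β-turn residues that are correct, and MCC as the Matthews correlation coefficient). The function name and the example counts are hypothetical.

```python
import math


def turn_metrics(tp, tn, fp, fn):
    """Return Q(total), Q(predicted) and MCC for per-residue turn predictions.

    tp/fn: beta-turn residues predicted as turn / as non-turn.
    tn/fp: non-turn residues predicted as non-turn / as turn.
    Definitions are assumed from common usage, not quoted from the record.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Percentage of all residues that were classified correctly.
    q_total = 100.0 * (tp + tn) / total

    # Percentage of predicted turns that are real turns; fewer false
    # positives raise this value.
    q_predicted = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"q_total": q_total, "q_predicted": q_predicted, "mcc": mcc}


# Hypothetical counts, chosen only to exercise the function.
example = turn_metrics(tp=50, tn=30, fp=10, fn=10)
```

Because Q(predicted) has false positives in its denominator, a model that makes fewer false positive predictions scores higher on it, which is consistent with the conclusion stated above.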
format Text
id pubmed-2613158
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2613158 2009-01-12 Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments Zheng, Ce Kurgan, Lukasz BMC Bioinformatics Methodology Article BioMed Central 2008-10-10 /pmc/articles/PMC2613158/ /pubmed/18847492 http://dx.doi.org/10.1186/1471-2105-9-430 Text en Copyright © 2008 Zheng and Kurgan; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Zheng, Ce
Kurgan, Lukasz
Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments
title Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2613158/
https://www.ncbi.nlm.nih.gov/pubmed/18847492
http://dx.doi.org/10.1186/1471-2105-9-430
work_keys_str_mv AT zhengce predictionofbetaturnsatover80accuracybasedonanensembleofpredictedsecondarystructuresandmultiplealignments
AT kurganlukasz predictionofbetaturnsatover80accuracybasedonanensembleofpredictedsecondarystructuresandmultiplealignments