Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments
BACKGROUND: The β-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting β-turns from protein sequences have been developed, but their prediction quality is relatively poor. T...
Main authors: | Zheng, Ce; Kurgan, Lukasz |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2613158/ https://www.ncbi.nlm.nih.gov/pubmed/18847492 http://dx.doi.org/10.1186/1471-2105-9-430 |
_version_ | 1782163161561956352 |
---|---|
author | Zheng, Ce Kurgan, Lukasz |
author_facet | Zheng, Ce Kurgan, Lukasz |
author_sort | Zheng, Ce |
collection | PubMed |
description | BACKGROUND: The β-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting β-turns from protein sequences have been developed, but their prediction quality is relatively poor. The novelty of the proposed sequence-based β-turn predictor stems from its use of window-based information extracted from four predicted three-state secondary structures, which, together with a selected set of position-specific scoring matrix (PSSM) values, serves as the input to a support vector machine (SVM) predictor. RESULTS: We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, the secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which agrees with recent studies. Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful for predicting non-β-turns. Empirical evaluation on three nonredundant datasets shows favorable Q(total), Q(predicted), and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q(total) barrier and achieves Q(total) = 80.9%, MCC = 0.47, and a Q(predicted) higher by over 6% than that of the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as input to the proposed prediction method. The applied feature set is smaller by 86, 62, and 37% when compared with the second-best and the two third-best (with respect to MCC) competing methods, respectively. CONCLUSION: Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model discriminates better between β-turns and non-β-turns because it produces fewer false positive predictions. The prediction model and datasets are freely available at . |
format | Text |
id | pubmed-2613158 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-26131582009-01-12 Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments Zheng, Ce Kurgan, Lukasz BMC Bioinformatics Methodology Article BACKGROUND: The β-turn is a type of protein secondary structure that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting β-turns from protein sequences have been developed, but their prediction quality is relatively poor. The novelty of the proposed sequence-based β-turn predictor stems from its use of window-based information extracted from four predicted three-state secondary structures, which, together with a selected set of position-specific scoring matrix (PSSM) values, serves as the input to a support vector machine (SVM) predictor. RESULTS: We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, the secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which agrees with recent studies. Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful for predicting non-β-turns. Empirical evaluation on three nonredundant datasets shows favorable Q(total), Q(predicted), and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q(total) barrier and achieves Q(total) = 80.9%, MCC = 0.47, and a Q(predicted) higher by over 6% than that of the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as input to the proposed prediction method. The applied feature set is smaller by 86, 62, and 37% when compared with the second-best and the two third-best (with respect to MCC) competing methods, respectively. CONCLUSION: Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model discriminates better between β-turns and non-β-turns because it produces fewer false positive predictions. The prediction model and datasets are freely available at . BioMed Central 2008-10-10 /pmc/articles/PMC2613158/ /pubmed/18847492 http://dx.doi.org/10.1186/1471-2105-9-430 Text en Copyright © 2008 Zheng and Kurgan; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Zheng, Ce Kurgan, Lukasz Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
title | Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
title_full | Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
title_fullStr | Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
title_full_unstemmed | Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
title_short | Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
title_sort | prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2613158/ https://www.ncbi.nlm.nih.gov/pubmed/18847492 http://dx.doi.org/10.1186/1471-2105-9-430 |
work_keys_str_mv | AT zhengce predictionofbetaturnsatover80accuracybasedonanensembleofpredictedsecondarystructuresandmultiplealignments AT kurganlukasz predictionofbetaturnsatover80accuracybasedonanensembleofpredictedsecondarystructuresandmultiplealignments |
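
The description field reports Q(total), Q(predicted), and MCC but does not define them. The sketch below assumes the definitions customary in the β-turn prediction literature: Q(total) is the percentage of residues classified correctly, Q(predicted) is the percentage of predicted β-turn residues that are truly β-turns, and MCC is the Matthews correlation coefficient. The function names, the `t`/`n` label encoding, and the toy sequences are illustrative assumptions and do not come from the article or its datasets.

```python
from math import sqrt


def confusion_counts(observed, predicted, positive="t"):
    """Count TP, TN, FP, FN over per-residue labels ('t' = turn, 'n' = non-turn)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    tp = tn = fp = fn = 0
    for o, p in zip(observed, predicted):
        if o == positive and p == positive:
            tp += 1
        elif o != positive and p != positive:
            tn += 1
        elif p == positive:
            fp += 1  # predicted turn, observed non-turn
        else:
            fn += 1  # predicted non-turn, observed turn
    return tp, tn, fp, fn


def q_total(tp, tn, fp, fn):
    """Percentage of all residues that were classified correctly."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def q_predicted(tp, fp):
    """Percentage of predicted turn residues that are observed turns."""
    return 100.0 * tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels, made up for illustration only.
    observed = "nnttttnnnn"
    predicted = "nnnttttnnn"
    tp, tn, fp, fn = confusion_counts(observed, predicted)
    print(q_total(tp, tn, fp, fn), q_predicted(tp, fp), mcc(tp, tn, fp, fn))
```

Under these assumed definitions, a false positive lowers Q(predicted) and MCC directly, which is consistent with the record's statement that the method improves by producing fewer false positive predictions.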