
PSP_MCSVM: brainstorming consensus prediction of protein secondary structures using two-stage multiclass support vector machines

Secondary structure prediction is a crucial task for understanding the variety of protein structures and the biological functions they perform. Predicting the secondary structure of a new protein from its amino acid sequence is of fundamental importance in bioinformatics. We propose a novel technique t...


Bibliographic Details
Main authors: Chatterjee, Piyali, Basu, Subhadip, Kundu, Mahantapas, Nasipuri, Mita, Plewczynski, Dariusz
Format: Online Article Text
Language: English
Published: Springer-Verlag 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168739/
https://www.ncbi.nlm.nih.gov/pubmed/21594694
http://dx.doi.org/10.1007/s00894-011-1102-8
_version_ 1782211416083660800
author Chatterjee, Piyali
Basu, Subhadip
Kundu, Mahantapas
Nasipuri, Mita
Plewczynski, Dariusz
author_sort Chatterjee, Piyali
collection PubMed
description Secondary structure prediction is a crucial task for understanding the variety of protein structures and the biological functions they perform. Predicting the secondary structure of a new protein from its amino acid sequence is of fundamental importance in bioinformatics. We propose a novel technique to predict protein secondary structures based on position-specific scoring matrices (PSSMs) and physicochemical properties of amino acids. It is a two-stage approach that uses multiclass support vector machines (SVMs) as classifiers for three structural conformations: helix, sheet and coil. In the first stage, PSSMs obtained from PSI-BLAST and five specially selected physicochemical properties of amino acids are fed into SVMs as features for sequence-to-structure prediction. The confidence values for helix, sheet and coil obtained from the first-stage SVM are then used in the second-stage SVM for structure-to-structure prediction. The two-stage cascaded classifiers (PSP_MCSVM) are trained on proteins from the RS126 dataset and tested on target proteins from the ninth Critical Assessment of protein Structure Prediction experiment (CASP9). With the brainstorming consensus procedure, PSP_MCSVM performs better than prediction servers such as Predator, DSC and SIMPA96 on randomly selected proteins from the CASP9 targets. The overall performance is found to be comparable with the current state of the art. PSP_MCSVM source code, train-test datasets and supplementary files are freely available in the public domain at: http://sysbio.icm.edu.pl/secstruct and http://code.google.com/p/cmater-bioinfo/ ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00894-011-1102-8) contains supplementary material, which is available to authorized users.
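The two-stage feature flow described in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes each residue is described by 25 values (the 20 PSSM columns from PSI-BLAST plus the five physicochemical properties), that neighbouring residues are gathered with a sliding window, and that window positions beyond the sequence ends are zero-padded. The function name `window_features`, the `half_width` parameter, the padding scheme and the toy numbers are all hypothetical. The SVM training itself is omitted.

```python
from typing import List, Sequence


def window_features(rows: Sequence[Sequence[float]],
                    half_width: int,
                    pad_value: float = 0.0) -> List[List[float]]:
    """Concatenate per-residue rows inside a window centred on each residue.

    Window positions that fall outside the sequence are filled with
    pad_value (an assumption; the record does not state the padding used).
    """
    if not rows:
        return []
    pad = [pad_value] * len(rows[0])
    vectors = []
    for i in range(len(rows)):
        vec: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            vec.extend(rows[j] if 0 <= j < len(rows) else pad)
        vectors.append(vec)
    return vectors


# Stage 1 input (sequence-to-structure): toy protein of 4 residues,
# each with 25 made-up feature values (20 PSSM scores + 5 properties).
toy_rows = [[float(i)] * 25 for i in range(4)]
stage1_inputs = window_features(toy_rows, half_width=1)

# Stage 2 input (structure-to-structure): made-up (helix, sheet, coil)
# confidence values standing in for the output of a first-stage SVM.
toy_confidences = [
    [0.7, 0.1, 0.2],
    [0.6, 0.2, 0.2],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
]
stage2_inputs = window_features(toy_confidences, half_width=1)

print(len(stage1_inputs[0]), len(stage2_inputs[0]))
```

Using the same windowing helper for both stages keeps the sketch short. In a real pipeline each list of vectors would be passed to a multiclass SVM, and the window widths for the two stages would be tuned separately.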
format Online
Article
Text
id pubmed-3168739
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Springer-Verlag
record_format MEDLINE/PubMed
spelling pubmed-3168739 2011-09-26 PSP_MCSVM: brainstorming consensus prediction of protein secondary structures using two-stage multiclass support vector machines Chatterjee, Piyali Basu, Subhadip Kundu, Mahantapas Nasipuri, Mita Plewczynski, Dariusz J Mol Model Original Paper Springer-Verlag 2011-05-19 2011 /pmc/articles/PMC3168739/ /pubmed/21594694 http://dx.doi.org/10.1007/s00894-011-1102-8 Text en © The Author(s) 2011 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
title PSP_MCSVM: brainstorming consensus prediction of protein secondary structures using two-stage multiclass support vector machines
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168739/
https://www.ncbi.nlm.nih.gov/pubmed/21594694
http://dx.doi.org/10.1007/s00894-011-1102-8