Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM
Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary s...
Main authors: | Liang, Yunyun; Liu, Sanyang; Zhang, Shengli |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4693000/ https://www.ncbi.nlm.nih.gov/pubmed/26788119 http://dx.doi.org/10.1155/2015/370756 |
_version_ | 1782407301709168640 |
---|---|
author | Liang, Yunyun Liu, Sanyang Zhang, Shengli |
author_facet | Liang, Yunyun Liu, Sanyang Zhang, Shengli |
author_sort | Liang, Yunyun |
collection | PubMed |
description | Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is central to the prediction of protein structural class and that it mainly draws on the protein primary sequence, the predicted secondary structure sequence, and the position-specific scoring matrix (PSSM). Currently, prediction based solely on the PSSM plays a key role in improving prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP that fuses the consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on the PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. A 700-dimensional (700D) feature vector is constructed, and its dimension is reduced to 224D by principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. It offers an important complement to other PSSM-based methods for the prediction of protein structural classes for low-similarity sequences. |
format | Online Article Text |
id | pubmed-4693000 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-46930002016-01-19 Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM Liang, Yunyun Liu, Sanyang Zhang, Shengli Comput Math Methods Med Research Article Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves the favorable and competitive performance. This will offer an important complementary to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences. Hindawi Publishing Corporation 2015 2015-12-15 /pmc/articles/PMC4693000/ /pubmed/26788119 http://dx.doi.org/10.1155/2015/370756 Text en Copyright © 2015 Yunyun Liang et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Liang, Yunyun Liu, Sanyang Zhang, Shengli Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM |
title | Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM |
title_sort | prediction of protein structural classes for low-similarity sequences based on consensus sequence and segmented pssm |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4693000/ https://www.ncbi.nlm.nih.gov/pubmed/26788119 http://dx.doi.org/10.1155/2015/370756 |
work_keys_str_mv | AT liangyunyun predictionofproteinstructuralclassesforlowsimilaritysequencesbasedonconsensussequenceandsegmentedpssm AT liusanyang predictionofproteinstructuralclassesforlowsimilaritysequencesbasedonconsensussequenceandsegmentedpssm AT zhangshengli predictionofproteinstructuralclassesforlowsimilaritysequencesbasedonconsensussequenceandsegmentedpssm |
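
The description above mentions two generic steps: an autocovariance transformation (ACT) of the PSSM and a PCA-based reduction of the feature dimension. The following is a minimal sketch of both, not the authors' implementation. It assumes a commonly used column-wise autocovariance formula, uses random matrices in place of real PSSMs, and picks illustrative sizes (lag 4, 10 components) rather than the 700D and 224D reported in the record. The function names `autocovariance_features` and `pca_reduce` are hypothetical.

```python
import numpy as np


def autocovariance_features(pssm, max_lag):
    """Column-wise autocovariance of a PSSM-like (length x n_cols) matrix.

    Assumed formula (a common form, not confirmed by the record):
    AC(j, lag) = sum_i (P[i, j] - mean_j) * (P[i + lag, j] - mean_j) / (L - lag)
    Returns a vector of n_cols * max_lag values.
    """
    pssm = np.asarray(pssm, dtype=float)
    length, _ = pssm.shape
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    centered = pssm - pssm.mean(axis=0)
    feats = []
    for lag in range(1, max_lag + 1):
        # Products of centered scores that are `lag` residues apart.
        prod = centered[:-lag] * centered[lag:]
        feats.append(prod.sum(axis=0) / (length - lag))
    return np.concatenate(feats)


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


# Synthetic stand-ins for PSSMs: 30 "proteins" of varying length, 20 columns.
rng = np.random.default_rng(0)
pssms = [rng.normal(size=(rng.integers(60, 120), 20)) for _ in range(30)]

X = np.stack([autocovariance_features(p, max_lag=4) for p in pssms])
X_reduced = pca_reduce(X, n_components=10)
print(X.shape, X_reduced.shape)
```

Because the autocovariance vector has a fixed length regardless of sequence length, proteins of different sizes can be stacked into one feature matrix before PCA is applied.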