PSSP-RFE: Accurate Prediction of Protein Structural Class by Recursive Feature Extraction from PSI-BLAST Profile, Physical-Chemical Property and Functional Annotations

Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction beco...

Bibliographic Details
Main authors: Li, Liqi; Cui, Xiang; Yu, Sanjiu; Zhang, Yuan; Luo, Zhong; Yang, Hua; Zhou, Yue; Zheng, Xiaoqi
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3968047/
https://www.ncbi.nlm.nih.gov/pubmed/24675610
http://dx.doi.org/10.1371/journal.pone.0092863
_version_ 1782309104466788352
author Li, Liqi
Cui, Xiang
Yu, Sanjiu
Zhang, Yuan
Luo, Zhong
Yang, Hua
Zhou, Yue
Zheng, Xiaoqi
author_sort Li, Liqi
collection PubMed
description Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Amongst most homological-based approaches, the accuracies of protein structural class prediction are sufficiently high for high similarity datasets, but still far from being satisfactory for low similarity datasets, i.e., below 40% in pairwise sequence similarity. Therefore, we present a novel method for accurate and reliable protein structural class prediction for both high and low similarity datasets. This method is based on Support Vector Machine (SVM) in conjunction with integrated features from position-specific score matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is also used to rank the integrated feature vectors through recursively removing the feature with the lowest ranking score. The definitive top features selected by SVM-RFE are input into the SVM engines to predict the structural class of a query protein. To validate our method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that our method could serve as an accurate and cost-effective alternative to existing methods in protein structural classification, especially for low similarity datasets.
format Online
Article
Text
id pubmed-3968047
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3968047 2014-04-01 PSSP-RFE: Accurate Prediction of Protein Structural Class by Recursive Feature Extraction from PSI-BLAST Profile, Physical-Chemical Property and Functional Annotations Li, Liqi Cui, Xiang Yu, Sanjiu Zhang, Yuan Luo, Zhong Yang, Hua Zhou, Yue Zheng, Xiaoqi PLoS One Research Article
Public Library of Science 2014-03-27 /pmc/articles/PMC3968047/ /pubmed/24675610 http://dx.doi.org/10.1371/journal.pone.0092863 Text en © 2014 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title PSSP-RFE: Accurate Prediction of Protein Structural Class by Recursive Feature Extraction from PSI-BLAST Profile, Physical-Chemical Property and Functional Annotations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3968047/
https://www.ncbi.nlm.nih.gov/pubmed/24675610
http://dx.doi.org/10.1371/journal.pone.0092863
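
The description field above summarizes two steps: SVM-RFE ranks features by recursively removing the one with the lowest ranking score, and a jackknife (leave-one-out) test then measures accuracy. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a binary toy problem (the paper predicts several structural classes), a bias-free linear SVM trained by plain subgradient descent, the squared weight as the ranking score, and a hand-made three-feature dataset standing in for the PSSM, PROFEAT and GO features. All function names and parameter values are hypothetical.

```python
"""Toy sketch of SVM-RFE plus a jackknife test, standard library only."""


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit weights of a bias-free linear SVM by batch subgradient descent
    on the L2-regularised hinge loss. Labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:  # this sample violates the margin
                for j in range(d):
                    grad[j] += yi * xi[j]
        w = [wj - lr * (lam * wj - gj / n) for wj, gj in zip(w, grad)]
    return w


def rfe_ranking(X, y):
    """Return feature indices ordered from most to least useful.

    Each round retrains the SVM on the surviving features and removes
    the one with the smallest squared weight (the assumed ranking score)."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining + eliminated[::-1]


def jackknife_accuracy(X, y, keep):
    """Leave-one-out accuracy; feature selection is redone inside each
    fold so the held-out sample never influences the ranking."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        top = rfe_ranking(X_train, y_train)[:keep]
        w = train_linear_svm([[row[j] for j in top] for row in X_train], y_train)
        score = sum(wj * X[i][j] for wj, j in zip(w, top))
        correct += (1 if score > 0 else -1) == y[i]
    return correct / len(X)


# Hypothetical data: feature 0 is strongly informative, feature 1 weakly
# informative, feature 2 is noise that is uncorrelated with the label.
X = [
    [2.0, 0.25, 1.0], [2.0, 0.25, -1.0], [1.0, 0.25, 1.0], [1.0, 0.25, -1.0],
    [-2.0, -0.25, 1.0], [-2.0, -0.25, -1.0], [-1.0, -0.25, 1.0], [-1.0, -0.25, -1.0],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

print("ranking (best first):", rfe_ranking(X, y))
print("jackknife accuracy, top 2 features:", jackknife_accuracy(X, y, keep=2))
```

Repeating the feature ranking inside every jackknife fold is a design choice of this sketch; the description does not say whether the authors ranked features once or per fold.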