Enhancing protein-vitamin binding residues prediction by multiple heterogeneous subspace SVMs ensemble

Bibliographic Details
Main authors: Yu, Dong-Jun; Hu, Jun; Yan, Hui; Yang, Xi-Bei; Yang, Jing-Yu; Shen, Hong-Bin
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4261549/
https://www.ncbi.nlm.nih.gov/pubmed/25189131
http://dx.doi.org/10.1186/1471-2105-15-297
Description
Summary:
BACKGROUND: Vitamins are typical ligands that play critical roles in various metabolic processes. Accurate identification of vitamin-binding residues based solely on protein sequence is important for the functional annotation of proteins, especially in the post-genomic era, when large volumes of protein sequences are accumulating quickly without being functionally annotated.

RESULTS: In this paper, a new predictor called TargetVita is designed and implemented for predicting protein-vitamin binding residues from protein sequences. In TargetVita, features derived from the position-specific scoring matrix (PSSM), predicted protein secondary structure, and vitamin binding propensity are combined to form the original feature space; several feature subspaces are then selected by applying different feature selection methods. Finally, heterogeneous SVMs are trained on the selected feature subspaces and combined into an ensemble for prediction.

CONCLUSIONS: Experimental results on four separate vitamin-binding benchmark datasets demonstrate that TargetVita outperforms the state-of-the-art vitamin-specific predictor, with an average improvement of 10% in the Matthews correlation coefficient (MCC) in independent validation tests. The TargetVita web server and the datasets used are freely available for academic use at http://csbio.njust.edu.cn/bioinf/TargetVita or http://www.csbio.sjtu.edu.cn/bioinf/TargetVita.

ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2105-15-297) contains supplementary material, which is available to authorized users.
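
The summary describes combining several SVMs and reporting performance as MCC, but does not say how the ensemble is formed. The sketch below is only an illustration under stated assumptions: it takes per-residue scores from three hypothetical classifiers as given, averages them (an assumed combination rule, not necessarily the one used in TargetVita), thresholds at 0.5 (also assumed), and computes MCC from the resulting confusion counts. The names `ensemble_scores`, `mcc`, `svm_a`, `svm_b`, `svm_c`, and all numeric values are invented for this example.

```python
import math


def ensemble_scores(score_lists):
    """Average per-residue scores across classifiers (assumed ensemble rule)."""
    n = len(score_lists[0])
    if any(len(scores) != n for scores in score_lists):
        raise ValueError("all score lists must cover the same residues")
    return [sum(column) / len(score_lists) for column in zip(*score_lists)]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = binding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical scores from three classifiers trained on different
# feature subspaces; values are made up for illustration only.
svm_a = [0.9, 0.2, 0.7, 0.1, 0.4, 0.8]
svm_b = [0.8, 0.3, 0.4, 0.2, 0.6, 0.9]
svm_c = [0.7, 0.1, 0.6, 0.3, 0.2, 0.7]
labels = [1, 0, 1, 0, 0, 1]  # hypothetical ground truth per residue

avg = ensemble_scores([svm_a, svm_b, svm_c])
pred = [1 if score >= 0.5 else 0 for score in avg]  # assumed threshold
print(mcc(labels, pred))  # prints 1.0
```

In this toy case two of the individual classifiers each misjudge one residue at the 0.5 threshold (`svm_b` on the third and fifth, `svm_c` on none), while the averaged scores classify all six correctly, which is the general motivation for combining classifiers.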