pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties

BACKGROUND: Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal pept...

Bibliographic Details
Main Authors: Sarda, Deepak; Chua, Gek Huey; Li, Kuo-Bin; Krishnan, Arun
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182350/
https://www.ncbi.nlm.nih.gov/pubmed/15963230
http://dx.doi.org/10.1186/1471-2105-6-152
_version_ 1782124655476211712
author Sarda, Deepak
Chua, Gek Huey
Li, Kuo-Bin
Krishnan, Arun
author_sort Sarda, Deepak
collection PubMed
description BACKGROUND: Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. However, such approaches lead to a loss of contextual information. Moreover, where information about the physicochemical properties of amino acids has been used, the methods employed to exploit that information are less than optimal and could use the information more effectively. RESULTS: In this paper, we propose a new algorithm called pSLIP which uses Support Vector Machines (SVMs) in conjunction with multiple physicochemical properties of amino acids to predict protein subcellular localization in eukaryotes across six different locations, namely, chloroplast, cytoplasmic, extracellular, mitochondrial, nuclear and plasma membrane. The algorithm was applied to the dataset provided by Park and Kanehisa and we obtained prediction accuracies for the different classes ranging from 87.7% – 97.0% with an overall accuracy of 93.1%. CONCLUSION: This study presents a physicochemical property based protein localization prediction algorithm. Unlike other algorithms, contextual information is preserved by dividing the protein sequences into clusters. The prediction accuracy shows an improvement over other algorithms based on various types of amino acid composition (single, pair and gapped pair). We have also implemented a web server to predict protein localization across the six classes (available at ).
format Text
id pubmed-1182350
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1182350 2005-08-04 pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties Sarda, Deepak Chua, Gek Huey Li, Kuo-Bin Krishnan, Arun BMC Bioinformatics Research Article
BioMed Central 2005-06-17 /pmc/articles/PMC1182350/ /pubmed/15963230 http://dx.doi.org/10.1186/1471-2105-6-152 Text en Copyright © 2005 Sarda et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Sarda, Deepak
Chua, Gek Huey
Li, Kuo-Bin
Krishnan, Arun
pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties
title pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182350/
https://www.ncbi.nlm.nih.gov/pubmed/15963230
http://dx.doi.org/10.1186/1471-2105-6-152