ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins
BACKGROUND: The expansion of raw protein sequence databases in the post genomic era and availability of fresh annotated sequences for major localizations particularly motivated us to introduce a new improved version of our previously forged eukaryotic subcellular localizations prediction method name...
Main authors: | Garg, Aarti; Raghava, Gajendra PS |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612013/ https://www.ncbi.nlm.nih.gov/pubmed/19038062 http://dx.doi.org/10.1186/1471-2105-9-503 |
_version_ | 1782163114070900736 |
---|---|
author | Garg, Aarti; Raghava, Gajendra PS |
author_facet | Garg, Aarti; Raghava, Gajendra PS |
author_sort | Garg, Aarti |
collection | PubMed |
description | BACKGROUND: The expansion of raw protein sequence databases in the post genomic era and availability of fresh annotated sequences for major localizations particularly motivated us to introduce a new improved version of our previously forged eukaryotic subcellular localizations prediction method namely "ESLpred". Since, subcellular localization of a protein offers essential clues about its functioning, hence, availability of localization predictor would definitely aid and expedite the protein deciphering studies. However, robustness of a predictor is highly dependent on the superiority of dataset and extracted protein attributes; hence, it becomes imperative to improve the performance of presently available method using latest dataset and crucial input features. RESULTS: Here, we describe augmentation in the prediction performance obtained for our most popular ESLpred method using new crucial features as an input to Support Vector Machine (SVM). In addition, recently available, highly non-redundant dataset encompassing three kingdoms specific protein sequence sets; 1198 fungi sequences, 2597 from animal and 491 plant sequences were also included in the present study. First, using the evolutionary information in the form of profile composition along with whole and N-terminal sequence composition as an input feature vector of 440 dimensions, overall accuracies of 72.7, 75.8 and 74.5% were achieved respectively after five-fold cross-validation. Further, enhancement in performance was observed when similarity search based results were coupled with whole and N-terminal sequence composition along with profile composition by yielding overall accuracies of 75.9, 80.8, 76.6% respectively; best accuracies reported till date on the same datasets. CONCLUSION: These results provide confidence about the reliability and accurate prediction of SVM modules generated in the present study using sequence and profile compositions along with similarity search based results. The presently developed modules are implemented as web server "ESLpred2" available at . |
format | Text |
id | pubmed-2612013 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2612013 2009-01-12 ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins Garg, Aarti; Raghava, Gajendra PS BMC Bioinformatics Research Article BioMed Central 2008-11-28 /pmc/articles/PMC2612013/ /pubmed/19038062 http://dx.doi.org/10.1186/1471-2105-9-503 Text en Copyright © 2008 Garg and Raghava; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612013/ https://www.ncbi.nlm.nih.gov/pubmed/19038062 http://dx.doi.org/10.1186/1471-2105-9-503 |
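
The abstract in this record mentions whole-sequence and N-terminal amino acid composition as part of the SVM input vector. As a rough illustration of what such composition features might look like, here is a minimal Python sketch. It is not the authors' implementation: the function names, the N-terminal window length of 25 residues, and the handling of non-standard residues are all assumptions made for this example, and the profile-composition part of the 440-dimensional vector is not reproduced here.

```python
# Illustrative sketch only; names and defaults below are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional fractional amino acid composition of seq.

    Residues outside the 20 standard letters are ignored (an assumption).
    """
    counted = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(counted)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counted.count(aa) / total for aa in AMINO_ACIDS]


def sequence_features(seq, n_term_len=25):
    """Concatenate whole-sequence and N-terminal composition (40 values).

    n_term_len is a hypothetical window size, not a value from the record.
    """
    return composition(seq) + composition(seq[:n_term_len])


if __name__ == "__main__":
    vec = sequence_features("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(vec))
```

A vector like this could then be combined with profile-derived features and passed to any SVM library for five-fold cross-validation; how the original work did so is not described in this record.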