Validating subcellular localization prediction tools with mycobacterial proteins

BACKGROUND: The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation and for the identification of new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few yea...

Bibliographic Details
Main authors: Restrepo-Montoya, Daniel, Vizcaíno, Carolina, Niño, Luis F, Ocampo, Marisol, Patarroyo, Manuel E, Patarroyo, Manuel A
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2685389/
https://www.ncbi.nlm.nih.gov/pubmed/19422713
http://dx.doi.org/10.1186/1471-2105-10-134
author Restrepo-Montoya, Daniel
Vizcaíno, Carolina
Niño, Luis F
Ocampo, Marisol
Patarroyo, Manuel E
Patarroyo, Manuel A
collection PubMed
description BACKGROUND: The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation and for the identification of new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few years, comprising both general localization and feature-based classifiers. Here, we have validated the ability of different bioinformatics approaches, through the use of SignalP 2.0, TatP 1.0, LipoP 1.0, Phobius, PA-SUB 2.5, PSORTb v.2.0.4 and Gpos-PLoc, to predict secreted bacterial proteins. These computational tools were compared in terms of sensitivity, specificity and Matthews correlation coefficient (MCC) using a set of mycobacterial proteins having less than 40% identity, none of which are included in the training data sets of the validated tools and whose subcellular localization has been experimentally confirmed. These proteins belong to the training data set of TBpred, a computational tool specifically designed to predict the subcellular localization of mycobacterial proteins. RESULTS: A final validation set of 272 mycobacterial proteins was obtained from the initial set of 852 mycobacterial proteins. According to the results of the validation metrics, all tools presented specificity above 0.90, while sensitivity and MCC values were more widely dispersed, all remaining above 0.22. PA-SUB 2.5 presented the highest values; however, these results might be biased due to the methodology used by this tool. PSORTb v.2.0.4 left 56 proteins out of the classification, while Gpos-PLoc left just one protein out. CONCLUSION: Both subcellular localization approaches had high predictive specificity and high recognition of true negatives for the tested data set. Among those tools whose predictions are not based on homology searches against SWISS-PROT, Gpos-PLoc was the general localization tool with the best predictive performance, while SignalP 2.0 was the best tool among the ones using a feature-based approach. Even though PA-SUB 2.5 presented the highest metrics, it should be taken into account that this tool was trained using all proteins reported in SWISS-PROT, which includes the protein set tested in this study, either for BLAST searches or for model training.
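The three validation metrics named in the description (sensitivity, specificity and MCC) can all be derived from a binary confusion matrix, for example secreted versus non-secreted proteins. The following is a minimal Python sketch, not the authors' code; the function name binary_metrics and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) for a binary confusion matrix.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives. Undefined ratios fall back to 0.0.
    """
    # Sensitivity: fraction of truly positive items predicted as positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of truly negative items predicted as negative.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient, ranging from -1 to +1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts, chosen only to illustrate the call.
sens, spec, mcc = binary_metrics(tp=40, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Because MCC uses all four cells of the confusion matrix, a tool can score high on specificity while its MCC stays low if it misses many true positives, which is one reason to report the three metrics together.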
format Text
id pubmed-2685389
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2685389 2009-05-22 Validating subcellular localization prediction tools with mycobacterial proteins Restrepo-Montoya, Daniel Vizcaíno, Carolina Niño, Luis F Ocampo, Marisol Patarroyo, Manuel E Patarroyo, Manuel A BMC Bioinformatics Research Article BACKGROUND: The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation and for the identification of new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few years, comprising both general localization and feature-based classifiers. Here, we have validated the ability of different bioinformatics approaches, through the use of SignalP 2.0, TatP 1.0, LipoP 1.0, Phobius, PA-SUB 2.5, PSORTb v.2.0.4 and Gpos-PLoc, to predict secreted bacterial proteins. These computational tools were compared in terms of sensitivity, specificity and Matthews correlation coefficient (MCC) using a set of mycobacterial proteins having less than 40% identity, none of which are included in the training data sets of the validated tools and whose subcellular localization has been experimentally confirmed. These proteins belong to the training data set of TBpred, a computational tool specifically designed to predict the subcellular localization of mycobacterial proteins. RESULTS: A final validation set of 272 mycobacterial proteins was obtained from the initial set of 852 mycobacterial proteins. According to the results of the validation metrics, all tools presented specificity above 0.90, while sensitivity and MCC values were more widely dispersed, all remaining above 0.22. PA-SUB 2.5 presented the highest values; however, these results might be biased due to the methodology used by this tool. PSORTb v.2.0.4 left 56 proteins out of the classification, while Gpos-PLoc left just one protein out. CONCLUSION: Both subcellular localization approaches had high predictive specificity and high recognition of true negatives for the tested data set. Among those tools whose predictions are not based on homology searches against SWISS-PROT, Gpos-PLoc was the general localization tool with the best predictive performance, while SignalP 2.0 was the best tool among the ones using a feature-based approach. Even though PA-SUB 2.5 presented the highest metrics, it should be taken into account that this tool was trained using all proteins reported in SWISS-PROT, which includes the protein set tested in this study, either for BLAST searches or for model training. BioMed Central 2009-05-07 /pmc/articles/PMC2685389/ /pubmed/19422713 http://dx.doi.org/10.1186/1471-2105-10-134 Text en Copyright © 2009 Restrepo-Montoya et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Validating subcellular localization prediction tools with mycobacterial proteins
topic Research Article