LocateP: Genome-scale subcellular-location predictor for bacterial proteins
BACKGROUND: In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader class...
Main authors: | Zhou, Miaomiao; Boekhorst, Jos; Francke, Christof; Siezen, Roland J |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2375117/ https://www.ncbi.nlm.nih.gov/pubmed/18371216 http://dx.doi.org/10.1186/1471-2105-9-173 |
_version_ | 1782154581355003904 |
---|---|
author | Zhou, Miaomiao Boekhorst, Jos Francke, Christof Siezen, Roland J |
author_facet | Zhou, Miaomiao Boekhorst, Jos Francke, Christof Siezen, Roland J |
author_sort | Zhou, Miaomiao |
collection | PubMed |
description | BACKGROUND: In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader classification. Although these tools and pipelines can achieve a high precision in the accurate prediction of signal peptides and transmembrane helices, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. RESULTS: LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms current tools especially where the N-terminally anchored and the SPIase-cleaved secreted proteins are concerned. Overall, the accuracy of LocateP was always higher than 90%. LocateP was then used to predict the SCLs of all proteins encoded by completed Gram-positive bacterial genomes. The results are stored in the database LocateP-DB [1]. CONCLUSION: LocateP is by far the most accurate and detailed protein SCL predictor for Gram-positive bacteria currently available. |
format | Text |
id | pubmed-2375117 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2375117 2008-05-10 LocateP: Genome-scale subcellular-location predictor for bacterial proteins Zhou, Miaomiao Boekhorst, Jos Francke, Christof Siezen, Roland J BMC Bioinformatics Methodology Article BACKGROUND: In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader classification. Although these tools and pipelines can achieve a high precision in the accurate prediction of signal peptides and transmembrane helices, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. RESULTS: LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms current tools especially where the N-terminally anchored and the SPIase-cleaved secreted proteins are concerned. Overall, the accuracy of LocateP was always higher than 90%. LocateP was then used to predict the SCLs of all proteins encoded by completed Gram-positive bacterial genomes. The results are stored in the database LocateP-DB [1]. CONCLUSION: LocateP is by far the most accurate and detailed protein SCL predictor for Gram-positive bacteria currently available. BioMed Central 2008-03-27 /pmc/articles/PMC2375117/ /pubmed/18371216 http://dx.doi.org/10.1186/1471-2105-9-173 Text en Copyright © 2008 Zhou et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Zhou, Miaomiao Boekhorst, Jos Francke, Christof Siezen, Roland J LocateP: Genome-scale subcellular-location predictor for bacterial proteins |
title | LocateP: Genome-scale subcellular-location predictor for bacterial proteins |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2375117/ https://www.ncbi.nlm.nih.gov/pubmed/18371216 http://dx.doi.org/10.1186/1471-2105-9-173 |