GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
Glycosylation is one of the most abundant post-translational modifications (PTMs) required for various structure/function modulations of proteins in a living cell. Although elucidated recently in prokaryotes, this type of PTM is present across all three domains of life. In prokaryotes, two types of...
Main authors: | Chauhan, Jagat S.; Bhat, Adil H.; Raghava, Gajendra P. S.; Rao, Alka |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392279/ https://www.ncbi.nlm.nih.gov/pubmed/22808107 http://dx.doi.org/10.1371/journal.pone.0040155 |
_version_ | 1782237615701884928 |
---|---|
author | Chauhan, Jagat S. Bhat, Adil H. Raghava, Gajendra P. S. Rao, Alka |
author_sort | Chauhan, Jagat S. |
collection | PubMed |
description | Glycosylation is one of the most abundant post-translational modifications (PTMs) required for various structure/function modulations of proteins in a living cell. Although elucidated recently in prokaryotes, this type of PTM is present across all three domains of life. In prokaryotes, two types of protein-glycan linkages are most widespread, namely N-linked, where a glycan moiety is attached to the amide group of Asn, and O-linked, where a glycan moiety is attached to the hydroxyl group of Ser/Thr/Tyr. Because of their biological ubiquity, significance, and technological applications, the study of prokaryotic glycoproteins is a fast-emerging area of research. Here we describe new Support Vector Machine (SVM) based algorithms (models) developed for predicting glycosylated residues (glycosites) with high accuracy in prokaryotic protein sequences. The models use binary profile of patterns, composition profile of patterns, and position-specific scoring matrix profile of patterns as training features. The study employs an extensive dataset of 107 N-linked and 116 O-linked glycosites extracted from 59 experimentally characterized glycoproteins of prokaryotes. This dataset includes validated N-glycosites from phyla Crenarchaeota and Euryarchaeota (domain Archaea) and Proteobacteria (domain Bacteria), and validated O-glycosites from phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria (domain Bacteria). In view of the current understanding that glycosylation occurs on folded proteins in bacteria, hybrid models have been developed using information on predicted secondary structures and accessible surface area in various combinations with training features. Using these models, N-glycosites and O-glycosites could be predicted with an accuracy of 82.71% (MCC 0.65) and 73.71% (MCC 0.48), respectively. An evaluation of the best-performing models with 28 independent prokaryotic glycoproteins confirms the suitability of these models for predicting N- and O-glycosites in potential glycoproteins from the aforementioned organisms with reasonably high confidence. A web server, GlycoPP, implementing these models is freely available at http://www.imtech.res.in/raghava/glycopp/. |
format | Online Article Text |
id | pubmed-3392279 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3392279 2012-07-17 GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences Chauhan, Jagat S. Bhat, Adil H. Raghava, Gajendra P. S. Rao, Alka PLoS One Research Article Glycosylation is one of the most abundant post-translational modifications (PTMs) required for various structure/function modulations of proteins in a living cell. Although elucidated recently in prokaryotes, this type of PTM is present across all three domains of life. In prokaryotes, two types of protein-glycan linkages are most widespread, namely N-linked, where a glycan moiety is attached to the amide group of Asn, and O-linked, where a glycan moiety is attached to the hydroxyl group of Ser/Thr/Tyr. Because of their biological ubiquity, significance, and technological applications, the study of prokaryotic glycoproteins is a fast-emerging area of research. Here we describe new Support Vector Machine (SVM) based algorithms (models) developed for predicting glycosylated residues (glycosites) with high accuracy in prokaryotic protein sequences. The models use binary profile of patterns, composition profile of patterns, and position-specific scoring matrix profile of patterns as training features. The study employs an extensive dataset of 107 N-linked and 116 O-linked glycosites extracted from 59 experimentally characterized glycoproteins of prokaryotes. This dataset includes validated N-glycosites from phyla Crenarchaeota and Euryarchaeota (domain Archaea) and Proteobacteria (domain Bacteria), and validated O-glycosites from phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria (domain Bacteria). In view of the current understanding that glycosylation occurs on folded proteins in bacteria, hybrid models have been developed using information on predicted secondary structures and accessible surface area in various combinations with training features. Using these models, N-glycosites and O-glycosites could be predicted with an accuracy of 82.71% (MCC 0.65) and 73.71% (MCC 0.48), respectively. An evaluation of the best-performing models with 28 independent prokaryotic glycoproteins confirms the suitability of these models for predicting N- and O-glycosites in potential glycoproteins from the aforementioned organisms with reasonably high confidence. A web server, GlycoPP, implementing these models is freely available at http://www.imtech.res.in/raghava/glycopp/. Public Library of Science 2012-07-09 /pmc/articles/PMC3392279/ /pubmed/22808107 http://dx.doi.org/10.1371/journal.pone.0040155 Text en Chauhan et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Chauhan, Jagat S. Bhat, Adil H. Raghava, Gajendra P. S. Rao, Alka GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences |
title | GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392279/ https://www.ncbi.nlm.nih.gov/pubmed/22808107 http://dx.doi.org/10.1371/journal.pone.0040155 |
work_keys_str_mv | AT chauhanjagats glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences AT bhatadilh glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences AT raghavagajendraps glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences AT raoalka glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences |
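
The abstract in this record names a "binary profile of patterns" as one training feature and reports performance as accuracy and MCC. The sketch below illustrates one common way such a feature can be built: a fixed-length window around each candidate residue is one-hot encoded, and MCC is computed from a confusion matrix. It is not the GlycoPP implementation. The window size, the padding symbol `X`, the handling of unknown letters, and all function names (`extract_window`, `binary_profile`, `candidate_sites`, `mcc`) are assumptions made for illustration; the record does not specify them.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for positions beyond the termini


def candidate_sites(sequence, targets):
    """Indices of residues that could be glycosylated, e.g. 'N' or 'STY'."""
    return [i for i, residue in enumerate(sequence.upper()) if residue in targets]


def extract_window(sequence, center, flank):
    """Return the pattern of length 2*flank + 1 centred on index `center`.

    The sequence is padded on both sides so that sites near a terminus
    still yield a full-length pattern (an assumption of this sketch).
    """
    padded = PAD * flank + sequence.upper() + PAD * flank
    return padded[center:center + 2 * flank + 1]


def binary_profile(pattern):
    """One-hot encode a pattern: 21 bits per position (20 residues + pad)."""
    alphabet = AMINO_ACIDS + PAD
    vector = []
    for residue in pattern:
        bits = [0] * len(alphabet)
        index = alphabet.find(residue)
        # Letters outside the alphabet are mapped to the pad slot (assumption).
        bits[index if index >= 0 else len(alphabet) - 1] = 1
        vector.extend(bits)
    return vector


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    toy_sequence = "MNASTK"  # made-up sequence, not from the GlycoPP dataset
    for site in candidate_sites(toy_sequence, "N"):
        pattern = extract_window(toy_sequence, site, flank=2)
        print(site, pattern, len(binary_profile(pattern)))
```

Feature vectors of this kind could then be passed to any SVM library for training; which kernel and parameters the authors used is not stated in this record.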