
GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences

Glycosylation is one of the most abundant post-translational modifications (PTMs) required for various structure/function modulations of proteins in a living cell. Although elucidated recently in prokaryotes, this type of PTM is present across all three domains of life. In prokaryotes, two types of...


Bibliographic Details
Main Authors: Chauhan, Jagat S., Bhat, Adil H., Raghava, Gajendra P. S., Rao, Alka
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392279/
https://www.ncbi.nlm.nih.gov/pubmed/22808107
http://dx.doi.org/10.1371/journal.pone.0040155
_version_ 1782237615701884928
author Chauhan, Jagat S.
Bhat, Adil H.
Raghava, Gajendra P. S.
Rao, Alka
author_facet Chauhan, Jagat S.
Bhat, Adil H.
Raghava, Gajendra P. S.
Rao, Alka
author_sort Chauhan, Jagat S.
collection PubMed
description Glycosylation is one of the most abundant post-translational modifications (PTMs) required for various structure/function modulations of proteins in a living cell. Although elucidated only recently in prokaryotes, this type of PTM is present across all three domains of life. In prokaryotes, two types of protein-glycan linkage are more widespread, namely N-linked, where a glycan moiety is attached to the amide group of Asn, and O-linked, where a glycan moiety is attached to the hydroxyl group of Ser/Thr/Tyr. Owing to their biological ubiquity, significance, and technological applications, the study of prokaryotic glycoproteins is a fast-emerging area of research. Here we describe new Support Vector Machine (SVM) based algorithms (models) developed for predicting glycosylated residues (glycosites) with high accuracy in prokaryotic protein sequences. The models use the binary profile of patterns, the composition profile of patterns, and the position-specific scoring matrix profile of patterns as training features. The study employs an extensive dataset of 107 N-linked and 116 O-linked glycosites extracted from 59 experimentally characterized glycoproteins of prokaryotes. This dataset includes validated N-glycosites from the phyla Crenarchaeota and Euryarchaeota (domain Archaea) and Proteobacteria (domain Bacteria), and validated O-glycosites from the phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria (domain Bacteria). In view of the current understanding that glycosylation occurs on folded proteins in bacteria, hybrid models have been developed using information on predicted secondary structures and accessible surface area in various combinations with the training features. Using these models, N-glycosites and O-glycosites could be predicted with an accuracy of 82.71% (MCC 0.65) and 73.71% (MCC 0.48), respectively.
An evaluation of the best-performing models with 28 independent prokaryotic glycoproteins confirms the suitability of these models for predicting N- and O-glycosites in potential glycoproteins from the aforementioned organisms, with reasonably high confidence. A web server, GlycoPP, implementing these models is freely available at http://www.imtech.res.in/raghava/glycopp/.
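The binary profile of patterns and the Matthews correlation coefficient (MCC) mentioned in the description can be illustrated with a minimal Python sketch. This is not the GlycoPP implementation: the window size (`flank`), the padding symbol `X`, the alphabet order, and the function names `extract_windows`, `binary_profile` and `mcc` are all assumptions made for illustration, since the record does not specify them.

```python
import math
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet order
PAD = "X"  # assumed padding symbol for positions beyond the sequence ends


def extract_windows(sequence: str, targets: str, flank: int) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every residue found in `targets`.

    Each window ("pattern") has length 2 * flank + 1 and is centred on the
    candidate residue, e.g. targets="N" for N-linked or "STY" for O-linked.
    """
    seq = sequence.upper()
    padded = PAD * flank + seq + PAD * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue in targets:
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def binary_profile(window: str) -> List[int]:
    """One-hot encode a window: 21 bits per position (20 amino acids + pad)."""
    alphabet = AMINO_ACIDS + PAD
    vector: List[int] = []
    for residue in window:
        # Any non-standard symbol is treated like padding (an assumption).
        symbol = residue if residue in AMINO_ACIDS else PAD
        vector.extend(1 if symbol == a else 0 for a in alphabet)
    return vector


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy sequence, not taken from the GlycoPP dataset.
    for position, window in extract_windows("MNASTK", "N", flank=2):
        print(position, window, sum(binary_profile(window)))
```

Vectors produced this way could be passed to any SVM library as training features; the composition, PSSM and structure-based hybrid features described in the abstract are not covered by this sketch.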
format Online
Article
Text
id pubmed-3392279
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3392279 2012-07-17 GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences Chauhan, Jagat S. Bhat, Adil H. Raghava, Gajendra P. S. Rao, Alka PLoS One Research Article Glycosylation is one of the most abundant post-translational modifications (PTMs) required for various structure/function modulations of proteins in a living cell. Although elucidated only recently in prokaryotes, this type of PTM is present across all three domains of life. In prokaryotes, two types of protein-glycan linkage are more widespread, namely N-linked, where a glycan moiety is attached to the amide group of Asn, and O-linked, where a glycan moiety is attached to the hydroxyl group of Ser/Thr/Tyr. Owing to their biological ubiquity, significance, and technological applications, the study of prokaryotic glycoproteins is a fast-emerging area of research. Here we describe new Support Vector Machine (SVM) based algorithms (models) developed for predicting glycosylated residues (glycosites) with high accuracy in prokaryotic protein sequences. The models use the binary profile of patterns, the composition profile of patterns, and the position-specific scoring matrix profile of patterns as training features. The study employs an extensive dataset of 107 N-linked and 116 O-linked glycosites extracted from 59 experimentally characterized glycoproteins of prokaryotes. This dataset includes validated N-glycosites from the phyla Crenarchaeota and Euryarchaeota (domain Archaea) and Proteobacteria (domain Bacteria), and validated O-glycosites from the phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria (domain Bacteria). In view of the current understanding that glycosylation occurs on folded proteins in bacteria, hybrid models have been developed using information on predicted secondary structures and accessible surface area in various combinations with the training features.
Using these models, N-glycosites and O-glycosites could be predicted with an accuracy of 82.71% (MCC 0.65) and 73.71% (MCC 0.48), respectively. An evaluation of the best-performing models with 28 independent prokaryotic glycoproteins confirms the suitability of these models for predicting N- and O-glycosites in potential glycoproteins from the aforementioned organisms, with reasonably high confidence. A web server, GlycoPP, implementing these models is freely available at http://www.imtech.res.in/raghava/glycopp/. Public Library of Science 2012-07-09 /pmc/articles/PMC3392279/ /pubmed/22808107 http://dx.doi.org/10.1371/journal.pone.0040155 Text en Chauhan et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Chauhan, Jagat S.
Bhat, Adil H.
Raghava, Gajendra P. S.
Rao, Alka
GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
title GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
title_full GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
title_fullStr GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
title_full_unstemmed GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
title_short GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences
title_sort glycopp: a webserver for prediction of n- and o-glycosites in prokaryotic protein sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392279/
https://www.ncbi.nlm.nih.gov/pubmed/22808107
http://dx.doi.org/10.1371/journal.pone.0040155
work_keys_str_mv AT chauhanjagats glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences
AT bhatadilh glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences
AT raghavagajendraps glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences
AT raoalka glycoppawebserverforpredictionofnandoglycositesinprokaryoticproteinsequences