
Characterization and identification of protein O-GlcNAcylation sites with substrate specificity

BACKGROUND: Protein O-GlcNAcylation involves the attachment of a single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues. Elucidation of O-GlcNAcylation sites on proteins is required in order to decipher its crucial roles in regulating cellular processes and aid in...


Bibliographic Details
Main authors: Wu, Hsin-Yi, Lu, Cheng-Tsung, Kao, Hui-Ju, Chen, Yi-Ju, Chen, Yu-Ju, Lee, Tzong-Yi
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290634/
https://www.ncbi.nlm.nih.gov/pubmed/25521204
http://dx.doi.org/10.1186/1471-2105-15-S16-S1
author Wu, Hsin-Yi
Lu, Cheng-Tsung
Kao, Hui-Ju
Chen, Yi-Ju
Chen, Yu-Ju
Lee, Tzong-Yi
collection PubMed
description BACKGROUND: Protein O-GlcNAcylation involves the attachment of a single N-acetylglucosamine (GlcNAc) to the hydroxyl group of serine or threonine residues. Elucidating O-GlcNAcylation sites on proteins is required to decipher the crucial roles of this modification in regulating cellular processes and to aid drug design. With an increasing number of O-GlcNAcylation sites identified by mass spectrometry (MS)-based proteomics, several methods have been proposed for the computational identification of O-GlcNAcylation sites. However, no existing work focuses on investigating O-GlcNAcylated substrate motifs. Thus, we were motivated to design a new method for identifying protein O-GlcNAcylation sites that takes substrate site specificity into account. RESULTS: In this study, 375 experimentally verified O-GlcNAcylation sites were collected from dbOGAP, an integrated resource for protein O-GlcNAcylation. Because the substrate motifs are difficult to characterize by conventional sequence logo analysis, a recursive statistical method was applied to obtain significant conserved motifs. Support Vector Machines (SVMs) were adopted to construct predictive models learned from the identified substrate motifs. Five-fold cross-validation was used to evaluate the predictive model, which achieved a sensitivity, specificity, and accuracy of 0.76, 0.80, and 0.78, respectively. Additionally, an independent testing set, which was truly blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (0.94) and outperform three other O-GlcNAcylation site prediction tools. CONCLUSION: This work proposes a computational method to identify informative substrate motifs for O-GlcNAcylation sites. Cross-validation and independent testing indicated that the identified motifs were effective in identifying O-GlcNAcylation sites. A case study demonstrated that the proposed method could be a feasible means of conducting preliminary analyses of protein O-GlcNAcylation. We also anticipate that the revealed substrate motifs may facilitate the study of the extensive crosstalk between O-GlcNAcylation and phosphorylation. This method may help unravel their mechanisms and roles in signaling, transcription, chronic disease, and cancer.
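The three cross-validation figures quoted in the abstract are standard confusion-matrix metrics. As a minimal sketch, assuming a hypothetical balanced evaluation set of 100 positive and 100 negative sites (these counts are illustrative and are not taken from the paper), the Python snippet below shows how sensitivity, specificity, and accuracy relate to one another. The function name `classification_metrics` is likewise an assumption made for illustration.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of true sites recovered
    specificity = tn / (tn + fp)                 # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all calls that are correct
    return sensitivity, specificity, accuracy


# Hypothetical counts (NOT from the paper): 100 positives, 100 negatives.
sn, sp, acc = classification_metrics(tp=76, fn=24, tn=80, fp=20)
print(f"sensitivity={sn:.2f} specificity={sp:.2f} accuracy={acc:.2f}")
```

When positives and negatives are equal in number, accuracy is the mean of sensitivity and specificity, which is consistent with the reported 0.76, 0.80, and 0.78. Whether the authors' evaluation sets were actually balanced is not stated in this record.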
format Online
Article
Text
id pubmed-4290634
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4290634 2015-01-15 Characterization and identification of protein O-GlcNAcylation sites with substrate specificity Wu, Hsin-Yi Lu, Cheng-Tsung Kao, Hui-Ju Chen, Yi-Ju Chen, Yu-Ju Lee, Tzong-Yi BMC Bioinformatics Research BioMed Central 2014-12-08 /pmc/articles/PMC4290634/ /pubmed/25521204 http://dx.doi.org/10.1186/1471-2105-15-S16-S1 Text en Copyright © 2014 Wu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Characterization and identification of protein O-GlcNAcylation sites with substrate specificity
topic Research