dbOGAP - An Integrated Bioinformatics Resource for Protein O-GlcNAcylation
BACKGROUND: Protein O-GlcNAcylation (or O-GlcNAc-ylation) is an O-linked glycosylation involving the transfer of β-N-acetylglucosamine to the hydroxyl group of serine or threonine residues of proteins. Growing evidence suggests that protein O-GlcNAcylation is common and is analogous to phosphorylati...
Main authors: | Wang, Jinlian; Torii, Manabu; Liu, Hongfang; Hart, Gerald W; Hu, Zhang-Zhi |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Database |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083348/ https://www.ncbi.nlm.nih.gov/pubmed/21466708 http://dx.doi.org/10.1186/1471-2105-12-91 |
_version_ | 1782202381644070912 |
---|---|
author | Wang, Jinlian Torii, Manabu Liu, Hongfang Hart, Gerald W Hu, Zhang-Zhi |
author_facet | Wang, Jinlian Torii, Manabu Liu, Hongfang Hart, Gerald W Hu, Zhang-Zhi |
author_sort | Wang, Jinlian |
collection | PubMed |
description | BACKGROUND: Protein O-GlcNAcylation (or O-GlcNAc-ylation) is an O-linked glycosylation involving the transfer of β-N-acetylglucosamine to the hydroxyl group of serine or threonine residues of proteins. Growing evidence suggests that protein O-GlcNAcylation is common and is analogous to phosphorylation in modulating broad ranges of biological processes. However, compared to phosphorylation, the amount of protein O-GlcNAcylation data is relatively limited and its annotation in databases is scarce. Furthermore, a bioinformatics resource for O-GlcNAcylation is lacking, and an O-GlcNAcylation site prediction tool is much needed. DESCRIPTION: We developed a database of O-GlcNAcylated proteins and sites, dbOGAP, primarily based on literature published since O-GlcNAcylation was first described in 1984. The database currently contains ~800 proteins with experimental O-GlcNAcylation information, of which ~61% are human, and 172 proteins have a total of ~400 O-GlcNAcylation sites identified. The O-GlcNAcylated proteins are primarily nucleocytoplasmic, including membrane- and non-membrane-bounded organelle-associated proteins. The known O-GlcNAcylated proteins exert a broad range of functions including transcriptional regulation, macromolecular complex assembly, intracellular transport, translation, and regulation of cell growth or death. The database also contains ~365 potential O-GlcNAcylated proteins inferred from known O-GlcNAcylated orthologs. Additional annotations, including other protein posttranslational modifications, biological pathways and disease information, are integrated into the database. We developed an O-GlcNAcylation site prediction system, OGlcNAcScan, based on a Support Vector Machine and trained using protein sequences with known O-GlcNAcylation sites from dbOGAP. The site prediction system achieved an area under the ROC curve of 74.3% in five-fold cross-validation. The dbOGAP website was developed to support searching and querying of O-GlcNAcylated proteins and associated literature, as well as browsing by gene names, organisms or pathways, and downloading of the database. Also available from the website, the OGlcNAcScan tool presents a list of predicted O-GlcNAcylation sites for given protein sequences. CONCLUSIONS: dbOGAP is the first public bioinformatics resource to allow systematic access to O-GlcNAcylated proteins and related functional information and bibliography, as well as to an O-GlcNAcylation site prediction tool. The resource will facilitate research on O-GlcNAcylation and its proteomic identification. |
format | Text |
id | pubmed-3083348 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-30833482011-04-28 dbOGAP - An Integrated Bioinformatics Resource for Protein O-GlcNAcylation Wang, Jinlian Torii, Manabu Liu, Hongfang Hart, Gerald W Hu, Zhang-Zhi BMC Bioinformatics Database BACKGROUND: Protein O-GlcNAcylation (or O-GlcNAc-ylation) is an O-linked glycosylation involving the transfer of β-N-acetylglucosamine to the hydroxyl group of serine or threonine residues of proteins. Growing evidences suggest that protein O-GlcNAcylation is common and is analogous to phosphorylation in modulating broad ranges of biological processes. However, compared to phosphorylation, the amount of protein O-GlcNAcylation data is relatively limited and its annotation in databases is scarce. Furthermore, a bioinformatics resource for O-GlcNAcylation is lacking, and an O-GlcNAcylation site prediction tool is much needed. DESCRIPTION: We developed a database of O-GlcNAcylated proteins and sites, dbOGAP, primarily based on literature published since O-GlcNAcylation was first described in 1984. The database currently contains ~800 proteins with experimental O-GlcNAcylation information, of which ~61% are of humans, and 172 proteins have a total of ~400 O-GlcNAcylation sites identified. The O-GlcNAcylated proteins are primarily nucleocytoplasmic, including membrane- and non-membrane bounded organelle-associated proteins. The known O-GlcNAcylated proteins exert a broad range of functions including transcriptional regulation, macromolecular complex assembly, intracellular transport, translation, and regulation of cell growth or death. The database also contains ~365 potential O-GlcNAcylated proteins inferred from known O-GlcNAcylated orthologs. Additional annotations, including other protein posttranslational modifications, biological pathways and disease information are integrated into the database. 
We developed an O-GlcNAcylation site prediction system, OGlcNAcScan, based on Support Vector Machine and trained using protein sequences with known O-GlcNAcylation sites from dbOGAP. The site prediction system achieved an area under ROC curve of 74.3% in five-fold cross-validation. The dbOGAP website was developed to allow for performing search and query on O-GlcNAcylated proteins and associated literature, as well as for browsing by gene names, organisms or pathways, and downloading of the database. Also available from the website, the OGlcNAcScan tool presents a list of predicted O-GlcNAcylation sites for given protein sequences. CONCLUSIONS: dbOGAP is the first public bioinformatics resource to allow systematic access to the O-GlcNAcylated proteins, and related functional information and bibliography, as well as to an O-GlcNAcylation site prediction tool. The resource will facilitate research on O-GlcNAcylation and its proteomic identification. BioMed Central 2011-04-06 /pmc/articles/PMC3083348/ /pubmed/21466708 http://dx.doi.org/10.1186/1471-2105-12-91 Text en Copyright ©2011 Wang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | dbOGAP - An Integrated Bioinformatics Resource for Protein O-GlcNAcylation |
topic | Database |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083348/ https://www.ncbi.nlm.nih.gov/pubmed/21466708 http://dx.doi.org/10.1186/1471-2105-12-91 |