
Machine learning approach to predict protein phosphorylation sites by incorporating evolutionary information

BACKGROUND: Most existing in silico phosphorylation site prediction systems use a machine learning approach, which requires preparing a good set of classification data in order to build the classification knowledge. Furthermore, phosphorylation is catalyzed by kinase enzymes, and hence the kinase...


Bibliographic Details
Main Authors: Biswas, Ashis Kumer, Noman, Nasimul, Sikder, Abdur Rahman
Format: Text
Language: English
Published: BioMed Central 2010
Subjects: Research article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887807/
https://www.ncbi.nlm.nih.gov/pubmed/20492656
http://dx.doi.org/10.1186/1471-2105-11-273
author Biswas, Ashis Kumer
Noman, Nasimul
Sikder, Abdur Rahman
collection PubMed
description BACKGROUND: Most existing in silico phosphorylation site prediction systems use a machine learning approach, which requires preparing a good set of classification data in order to build the classification knowledge. Furthermore, phosphorylation is catalyzed by kinase enzymes, and hence the kinase information of the phosphorylated sites has been used as the major classification data in most existing systems. Since the number of kinase annotations in protein sequences is far smaller than the number of proteins sequenced to date, prediction systems that rely on information from the small clique of kinase-annotated proteins cannot be considered reliable for predicting outside that clique. Hence these systems are not generalized. In this paper, a novel generalized prediction system, PPRED (Phosphorylation PREDictor), is proposed that ignores kinase information and uses only the evolutionary information of proteins for classifying phosphorylation sites. RESULTS: Experimental results based on cross-validation and an independent benchmark reveal the significance of using evolutionary information alone to classify phosphorylation sites from protein sequences. The prediction performance of the proposed system is better than that of existing prediction systems that also do not incorporate kinase information. The system is also comparable to systems that incorporate kinase information in predicting such sites. CONCLUSIONS: The approach presented in this paper provides an efficient way to identify phosphorylation sites in a given protein primary sequence, which would be valuable information for molecular biologists working on protein phosphorylation sites and for bioinformaticians developing generalized prediction systems for post-translational modifications such as phosphorylation or glycosylation. PPRED is publicly available at the URL http://www.cse.univdhaka.edu/~ashis/ppred/index.php.
format Text
id pubmed-2887807
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2887807 2010-06-19 Machine learning approach to predict protein phosphorylation sites by incorporating evolutionary information. Biswas, Ashis Kumer; Noman, Nasimul; Sikder, Abdur Rahman. BMC Bioinformatics. Research article. BioMed Central 2010-05-21 /pmc/articles/PMC2887807/ /pubmed/20492656 http://dx.doi.org/10.1186/1471-2105-11-273 Text en Copyright ©2010 Biswas et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Machine learning approach to predict protein phosphorylation sites by incorporating evolutionary information
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887807/
https://www.ncbi.nlm.nih.gov/pubmed/20492656
http://dx.doi.org/10.1186/1471-2105-11-273