Prediction of Protein Cleavage Site with Feature Selection by Random Forest

Proteinases play critical roles in both intra- and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific as part of degradation during protein catabolism or highly specific as part of proteolytic cascades and signal transduction events. Iden...

Bibliographic Details
Main authors: Li, Bi-Qing, Cai, Yu-Dong, Feng, Kai-Yan, Zhao, Gui-Jun
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3445488/
https://www.ncbi.nlm.nih.gov/pubmed/23029276
http://dx.doi.org/10.1371/journal.pone.0045854
collection PubMed
description Proteinases play critical roles in both intra- and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific, as part of degradation during protein catabolism, or highly specific, as part of proteolytic cascades and signal transduction events. Identification of these targets is extremely challenging. Current computational approaches for predicting cleavage sites are very limited, since they mainly represent the amino acid sequences as patterns or frequency matrices. In this work, we developed a novel predictor based on the Random Forest (RF) algorithm, using the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). Features describing physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were used to represent the peptides concerned. We compared our predictor with existing tools for predicting possible cleavage sites in candidate substrates. The comparison shows that our method makes much more reliable predictions in terms of overall prediction accuracy. In addition, the predictor can be applied to a wide range of proteinases.
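The feature selection pipeline named in the description (an mRMR ranking followed by incremental feature selection) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: features are assumed to be discrete, relevance and redundancy are measured with plain mutual information in the difference form, and every name (`mutual_information`, `mrmr_rank`, `incremental_feature_selection`, `make_toy_scorer`) as well as the toy data is hypothetical. The scorer is a deliberately simple stand-in for training and validating a Random Forest, which is the classifier the description names.

```python
import math
from collections import Counter, defaultdict


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels):
    """Greedy mRMR ranking: relevance to the labels minus mean redundancy
    with the features already selected (the 'difference' form)."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    selected = []

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected


def incremental_feature_selection(ranked, evaluate):
    """IFS: evaluate the top-1, top-2, ... prefixes of the ranking and
    return the shortest prefix with the highest score."""
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        current = evaluate(subset)
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score


def make_toy_scorer(columns, labels):
    """Hypothetical stand-in for training and validating a Random Forest:
    majority-vote accuracy within groups of identical feature values."""
    def evaluate(subset):
        groups = defaultdict(Counter)
        for i, label in enumerate(labels):
            groups[tuple(columns[j][i] for j in subset)][label] += 1
        correct = sum(g.most_common(1)[0][1] for g in groups.values())
        return correct / len(labels)
    return evaluate


# Toy data (made up for illustration): the label is determined by a and b.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1]
columns = [a, list(a), b, noise]  # column 1 duplicates column 0
labels = [2 * x + y for x, y in zip(a, b)]

ranked = mrmr_rank(columns, labels)
best_subset, best_score = incremental_feature_selection(
    ranked, make_toy_scorer(columns, labels)
)
print(ranked, best_subset, best_score)
```

On this toy data the duplicated column is pushed behind the complementary feature in the ranking, and the incremental step stops at the two features that jointly determine the label. In practice the scorer would be replaced by a properly validated classifier, since the toy scorer measures accuracy on the same samples it was fitted to.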
id pubmed-3445488
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3445488; 2012-10-01; PLoS One; Research Article; Public Library of Science; 2012-09-18; © 2012 Li et al.; http://creativecommons.org/licenses/by/4.0/; This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.