Prediction of Protein Cleavage Site with Feature Selection by Random Forest
Proteinases play critical roles in both intra and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific as part of degradation during protein catabolism or highly specific as part of proteolytic cascades and signal transduction events. Iden...
Main Authors: | Li, Bi-Qing; Cai, Yu-Dong; Feng, Kai-Yan; Zhao, Gui-Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2012 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3445488/ https://www.ncbi.nlm.nih.gov/pubmed/23029276 http://dx.doi.org/10.1371/journal.pone.0045854 |
_version_ | 1782243821304676352 |
---|---|
author | Li, Bi-Qing Cai, Yu-Dong Feng, Kai-Yan Zhao, Gui-Jun |
author_facet | Li, Bi-Qing Cai, Yu-Dong Feng, Kai-Yan Zhao, Gui-Jun |
author_sort | Li, Bi-Qing |
collection | PubMed |
description | Proteinases play critical roles in both intra- and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific, as part of degradation during protein catabolism, or highly specific, as part of proteolytic cascades and signal transduction events. Identification of these targets is extremely challenging. Current computational approaches for predicting cleavage sites are very limited since they mainly represent the amino acid sequences as patterns or frequency matrices. In this work, we developed a novel predictor based on the Random Forest (RF) algorithm, using the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). Features of physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were used to represent the peptides concerned. We compared our predictor with existing tools for predicting possible cleavage sites in candidate substrates. Our method is shown to make much more reliable predictions in terms of overall prediction accuracy. In addition, the predictor can be applied to a wide range of proteinases. |
format | Online Article Text |
id | pubmed-3445488 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-34454882012-10-01 Prediction of Protein Cleavage Site with Feature Selection by Random Forest Li, Bi-Qing Cai, Yu-Dong Feng, Kai-Yan Zhao, Gui-Jun PLoS One Research Article Proteinases play critical roles in both intra and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific as part of degradation during protein catabolism or highly specific as part of proteolytic cascades and signal transduction events. Identification of these targets is extremely challenging. Current computational approaches for predicting cleavage sites are very limited since they mainly represent the amino acid sequences as patterns or frequency matrices. In this work, we developed a novel predictor based on Random Forest algorithm (RF) using maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). The features of physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were utilized to represent the peptides concerned. Here, we compared existing prediction tools which are available for predicting possible cleavage sites in candidate substrates with ours. It is shown that our method makes much more reliable predictions in terms of the overall prediction accuracy. In addition, this predictor allows the use of a wide range of proteinases. Public Library of Science 2012-09-18 /pmc/articles/PMC3445488/ /pubmed/23029276 http://dx.doi.org/10.1371/journal.pone.0045854 Text en © 2012 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Li, Bi-Qing Cai, Yu-Dong Feng, Kai-Yan Zhao, Gui-Jun Prediction of Protein Cleavage Site with Feature Selection by Random Forest |
title | Prediction of Protein Cleavage Site with Feature Selection by Random Forest |
title_full | Prediction of Protein Cleavage Site with Feature Selection by Random Forest |
title_fullStr | Prediction of Protein Cleavage Site with Feature Selection by Random Forest |
title_full_unstemmed | Prediction of Protein Cleavage Site with Feature Selection by Random Forest |
title_short | Prediction of Protein Cleavage Site with Feature Selection by Random Forest |
title_sort | prediction of protein cleavage site with feature selection by random forest |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3445488/ https://www.ncbi.nlm.nih.gov/pubmed/23029276 http://dx.doi.org/10.1371/journal.pone.0045854 |
work_keys_str_mv | AT libiqing predictionofproteincleavagesitewithfeatureselectionbyrandomforest AT caiyudong predictionofproteincleavagesitewithfeatureselectionbyrandomforest AT fengkaiyan predictionofproteincleavagesitewithfeatureselectionbyrandomforest AT zhaoguijun predictionofproteincleavagesitewithfeatureselectionbyrandomforest |
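
The description field above outlines a pipeline of mRMR feature ranking followed by incremental feature selection (IFS) around a classifier. The sketch below is not the authors' implementation; it only illustrates the general idea on toy data. It assumes discrete features, uses the difference form of mRMR (relevance minus mean redundancy, both measured by mutual information), and swaps the Random Forest for a trivial leave-one-out majority-vote lookup classifier so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_rank(columns, labels):
    """Greedy mRMR ranking: pick the feature with the highest
    relevance(feature, label) minus mean redundancy(feature, already ranked)."""
    remaining = list(range(len(columns)))
    ranked = []

    def score(j):
        relevance = mutual_information(columns[j], labels)
        if not ranked:
            return relevance
        redundancy = sum(mutual_information(columns[j], columns[k]) for k in ranked)
        return relevance - redundancy / len(ranked)

    while remaining:
        best = max(remaining, key=score)  # ties go to the lowest index
        ranked.append(best)
        remaining.remove(best)
    return ranked


def loo_lookup_accuracy(rows, labels):
    """Stand-in evaluator (NOT a Random Forest): leave-one-out accuracy of a
    majority vote over the other samples that share the same feature values,
    falling back to the overall majority when no other sample matches."""
    correct = 0
    for i, row in enumerate(rows):
        others = [(r, y) for j, (r, y) in enumerate(zip(rows, labels)) if j != i]
        votes = [y for r, y in others if r == row] or [y for _, y in others]
        correct += Counter(votes).most_common(1)[0][0] == labels[i]
    return correct / len(rows)


def incremental_feature_selection(columns, labels, ranked, evaluate=loo_lookup_accuracy):
    """Evaluate the top-1, top-2, ... ranked feature sets and return
    (size of the smallest best-scoring set, list of scores)."""
    scores = []
    for k in range(1, len(ranked) + 1):
        rows = list(zip(*(columns[j] for j in ranked[:k])))
        scores.append(evaluate(rows, labels))
    best_k = max(range(len(scores)), key=scores.__getitem__) + 1
    return best_k, scores


# Toy data (hypothetical): the label equals f0 OR f2; f1 is an exact copy of f0.
labels = [1, 1, 1, 1, 0, 0, 0, 0] * 2
f0 = [1, 1, 1, 0, 0, 0, 0, 0] * 2
f1 = list(f0)
f2 = [0, 0, 1, 1, 0, 0, 0, 0] * 2
columns = [f0, f1, f2]

ranked = mrmr_rank(columns, labels)
best_k, scores = incremental_feature_selection(columns, labels, ranked)
print(ranked)   # [0, 2, 1]: the redundant copy f1 is ranked last
print(scores)   # [0.875, 1.0, 1.0]
print(best_k)   # 2
```

In a realistic setting the `evaluate` argument would wrap a cross-validated Random Forest (for example, one built with a library such as scikit-learn) instead of the lookup classifier, and continuous features would need discretisation or a different mutual information estimator before ranking.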