Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS
Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this st...
Main authors: | Li, Bi-Qing; Feng, Kai-Yan; Chen, Lei; Huang, Tao; Cai, Yu-Dong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2012 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3429425/ https://www.ncbi.nlm.nih.gov/pubmed/22937126 http://dx.doi.org/10.1371/journal.pone.0043927 |
_version_ | 1782241791335989248 |
---|---|
author | Li, Bi-Qing Feng, Kai-Yan Chen, Lei Huang, Tao Cai, Yu-Dong |
author_facet | Li, Bi-Qing Feng, Kai-Yan Chen, Lei Huang, Tao Cai, Yu-Dong |
author_sort | Li, Bi-Qing |
collection | PubMed |
description | Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on the Random Forest (RF) algorithm, combined with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and a Matthews correlation coefficient (MCC) of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. Site-specific feature analysis also showed that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction. |
format | Online Article Text |
id | pubmed-3429425 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-34294252012-08-30 Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS Li, Bi-Qing Feng, Kai-Yan Chen, Lei Huang, Tao Cai, Yu-Dong PLoS One Research Article Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and MCC of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. It was also shown via site-specific feature analysis that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction. 
Public Library of Science 2012-08-28 /pmc/articles/PMC3429425/ /pubmed/22937126 http://dx.doi.org/10.1371/journal.pone.0043927 Text en © 2012 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Li, Bi-Qing Feng, Kai-Yan Chen, Lei Huang, Tao Cai, Yu-Dong Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS |
title | Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS |
title_full | Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS |
title_fullStr | Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS |
title_full_unstemmed | Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS |
title_short | Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS |
title_sort | prediction of protein-protein interaction sites by random forest algorithm with mrmr and ifs |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3429425/ https://www.ncbi.nlm.nih.gov/pubmed/22937126 http://dx.doi.org/10.1371/journal.pone.0043927 |
work_keys_str_mv | AT libiqing predictionofproteinproteininteractionsitesbyrandomforestalgorithmwithmrmrandifs AT fengkaiyan predictionofproteinproteininteractionsitesbyrandomforestalgorithmwithmrmrandifs AT chenlei predictionofproteinproteininteractionsitesbyrandomforestalgorithmwithmrmrandifs AT huangtao predictionofproteinproteininteractionsitesbyrandomforestalgorithmwithmrmrandifs AT caiyudong predictionofproteinproteininteractionsitesbyrandomforestalgorithmwithmrmrandifs |
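
The abstract in this record names a pipeline of mRMR ranking followed by incremental feature selection (IFS), with the result reported as accuracy and MCC. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes discrete features, uses the common "relevance minus mean redundancy" form of mRMR, and substitutes a toy majority-vote classifier scored in-sample for the Random Forest the record describes. The feature names (`a`, `a_copy`, `b`, `noise`), the data, and every function name are hypothetical.

```python
from collections import Counter
from math import log2, sqrt


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_rank(columns, labels):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to the labels minus mean redundancy with features already picked."""
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    remaining, ranked = list(columns), []

    def score(name):
        if not ranked:
            return relevance[name]
        redundancy = sum(
            mutual_information(columns[name], columns[s]) for s in ranked
        ) / len(ranked)
        return relevance[name] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary 0/1 labels (0.0 if undefined)."""
    pairs = Counter(zip(y_true, y_pred))
    tp, tn = pairs[(1, 1)], pairs[(0, 0)]
    fp, fn = pairs[(0, 1)], pairs[(1, 0)]
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def incremental_feature_selection(ranked, evaluate):
    """Score every prefix of the ranked list; keep the shortest best-scoring one."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        current = evaluate(ranked[:k])
        if current > best_score:
            best_k, best_score = k, current
    return ranked[:best_k], best_score


# Hypothetical toy data: eight residues, binary labels, four binary features.
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]
FEATURES = {
    "a": [1, 1, 1, 0, 0, 0, 0, 0],       # informative
    "b": [1, 1, 0, 1, 0, 0, 1, 0],       # weakly informative, little overlap with "a"
    "a_copy": [1, 1, 1, 0, 0, 0, 0, 0],  # fully redundant with "a"
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],   # independent of the labels
}


def evaluate(selected):
    """Stand-in for a real classifier: strict majority vote of the selected
    features, scored by MCC on the same data (no cross-validation)."""
    preds = [
        1 if 2 * sum(FEATURES[name][i] for name in selected) > len(selected) else 0
        for i in range(len(LABELS))
    ]
    return mcc(LABELS, preds)


if __name__ == "__main__":
    ranking = mrmr_rank(FEATURES, LABELS)
    subset, subset_score = incremental_feature_selection(ranking, evaluate)
    print("mRMR ranking:", ranking)
    print("IFS subset:", subset, "MCC:", round(subset_score, 4))
```

In this toy setup the redundancy penalty moves `a_copy` below `b` even though `a_copy` is individually more relevant, which is the behaviour mRMR is designed to produce. A realistic evaluation would replace `evaluate` with a cross-validated classifier.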