Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS

Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this st...

Bibliographic Details
Main Authors: Li, Bi-Qing, Feng, Kai-Yan, Chen, Lei, Huang, Tao, Cai, Yu-Dong
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3429425/
https://www.ncbi.nlm.nih.gov/pubmed/22937126
http://dx.doi.org/10.1371/journal.pone.0043927
author Li, Bi-Qing
Feng, Kai-Yan
Chen, Lei
Huang, Tao
Cai, Yu-Dong
collection PubMed
description Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on the Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features to predict protein-protein interaction sites and achieved an overall accuracy of 0.672997 and an MCC of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of protein-protein interaction sites. Site-specific feature analysis also showed that the features of individual residues from PPI sites contribute most to the determination of protein-protein interaction sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction.
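The abstract reports accuracy and the Matthews correlation coefficient (MCC) and names incremental feature selection (IFS) over an mRMR-ranked feature list. The following is a minimal sketch of those two ideas in plain Python, not the authors' implementation. All function names are hypothetical, the mRMR ranking is assumed to be supplied already, and `evaluate` is an assumed callback standing in for whatever score the reader chooses (for example, the cross-validated MCC of a random forest trained on the given feature subset).

```python
import math
from typing import Callable, List, Sequence, Tuple


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention
    (an assumption here, not something stated in the record).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ..., top-N ranked features and keep the best.

    `ranked_features` is assumed to be ordered best-first (e.g. by mRMR).
    `evaluate` is a hypothetical callback returning a score for a subset.
    Ties keep the smaller subset because only strict improvements replace it.
    """
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score
```

In use, `evaluate` would wrap model training and validation; the sketch only fixes the search order (nested prefixes of the ranking) and the selection rule (highest score wins).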
format Online
Article
Text
id pubmed-3429425
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3429425 2012-08-30
Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS
Li, Bi-Qing; Feng, Kai-Yan; Chen, Lei; Huang, Tao; Cai, Yu-Dong
PLoS One, Research Article
Public Library of Science 2012-08-28
/pmc/articles/PMC3429425/ /pubmed/22937126 http://dx.doi.org/10.1371/journal.pone.0043927
Text en © 2012 Li et al http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS
topic Research Article