A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction
Protein–protein interactions (PPIs) play key roles in most cellular processes, such as cell metabolism, immune response, endocrine function, DNA replication, and transcription regulation. PPI prediction is one of the most challenging problems in functional genomics. Although PPI data have been incre...
Main authors: | Du, Xiuquan; Cheng, Jiaxing; Zheng, Tingting; Duan, Zheng; Qian, Fulan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4139871/ https://www.ncbi.nlm.nih.gov/pubmed/25046746 http://dx.doi.org/10.3390/ijms150712731 |
_version_ | 1782331430641074176 |
---|---|
author | Du, Xiuquan Cheng, Jiaxing Zheng, Tingting Duan, Zheng Qian, Fulan |
author_facet | Du, Xiuquan Cheng, Jiaxing Zheng, Tingting Duan, Zheng Qian, Fulan |
author_sort | Du, Xiuquan |
collection | PubMed |
description | Protein–protein interactions (PPIs) play key roles in most cellular processes, such as cell metabolism, immune response, endocrine function, DNA replication, and transcription regulation. PPI prediction is one of the most challenging problems in functional genomics. Although PPI data have been increasing because of the development of high-throughput technologies and computational methods, many problems are still far from being solved. In this study, a novel predictor was designed by using the Random Forest (RF) algorithm with the ensemble coding (EC) method. To reduce computational time, a feature selection method (DX) was adopted to rank the features and search for the optimal feature combination. The DXEC method integrates many features and physicochemical/biochemical properties to predict PPIs. On the Gold Yeast dataset, the DXEC method achieves 67.2% overall precision, 80.74% recall, and 70.67% accuracy. On the Silver Yeast dataset, the DXEC method achieves 76.93% precision, 77.98% recall, and 77.27% accuracy. On the human dataset, the prediction accuracy reaches 80% for the DXEC-RF method. We extended the experiment to larger, more realistic datasets, where the method maintains 50% recall on the Yeast All dataset and 80% recall on the Human All dataset. These results show that the DXEC method is suitable for performing PPI prediction. A prediction service for the DXEC-RF classifier is available at http://ailab.ahu.edu.cn:8087/DXECPPI/index.jsp. |
format | Online Article Text |
id | pubmed-4139871 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-4139871 2014-08-21 A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction Du, Xiuquan Cheng, Jiaxing Zheng, Tingting Duan, Zheng Qian, Fulan Int J Mol Sci Article MDPI 2014-07-18 /pmc/articles/PMC4139871/ /pubmed/25046746 http://dx.doi.org/10.3390/ijms150712731 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/). |
spellingShingle | Article Du, Xiuquan Cheng, Jiaxing Zheng, Tingting Duan, Zheng Qian, Fulan A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction |
title | A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction |
title_full | A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction |
title_fullStr | A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction |
title_full_unstemmed | A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction |
title_short | A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction |
title_sort | novel feature extraction scheme with ensemble coding for protein–protein interaction prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4139871/ https://www.ncbi.nlm.nih.gov/pubmed/25046746 http://dx.doi.org/10.3390/ijms150712731 |
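
The description reports precision, recall, and accuracy for each dataset. As a reminder of how these three figures relate to one another, here is a minimal sketch that computes them from a binary confusion matrix. The `Confusion` class, the function names, and the example counts are all hypothetical, chosen only for illustration; they are not taken from the article and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for a binary interacting / non-interacting classifier."""
    tp: int  # interacting pairs predicted as interacting
    fp: int  # non-interacting pairs predicted as interacting
    tn: int  # non-interacting pairs predicted as non-interacting
    fn: int  # interacting pairs predicted as non-interacting


def precision(c: Confusion) -> float:
    """Fraction of predicted interactions that are real."""
    predicted_pos = c.tp + c.fp
    return c.tp / predicted_pos if predicted_pos else 0.0


def recall(c: Confusion) -> float:
    """Fraction of real interactions that were recovered."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


def accuracy(c: Confusion) -> float:
    """Fraction of all pairs classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical counts, NOT from the article.
example = Confusion(tp=80, fp=40, tn=60, fn=20)
print(f"precision={precision(example):.4f}")
print(f"recall={recall(example):.4f}")
print(f"accuracy={accuracy(example):.4f}")
```

With counts like these, recall can sit well above precision while accuracy falls between the two, which is the same ordering the description reports for the Gold Yeast dataset.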