A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction

Protein–protein interactions (PPIs) play key roles in most cellular processes, such as cell metabolism, immune response, endocrine function, DNA replication, and transcription regulation. PPI prediction is one of the most challenging problems in functional genomics. Although PPI data have been incre...

Bibliographic Details
Main Authors: Du, Xiuquan; Cheng, Jiaxing; Zheng, Tingting; Duan, Zheng; Qian, Fulan
Format: Online Article Text
Language: English
Published: MDPI 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4139871/
https://www.ncbi.nlm.nih.gov/pubmed/25046746
http://dx.doi.org/10.3390/ijms150712731
author Du, Xiuquan
Cheng, Jiaxing
Zheng, Tingting
Duan, Zheng
Qian, Fulan
collection PubMed
description Protein–protein interactions (PPIs) play key roles in most cellular processes, such as cell metabolism, immune response, endocrine function, DNA replication, and transcription regulation. PPI prediction is one of the most challenging problems in functional genomics. Although PPI data have been increasing because of the development of high-throughput technologies and computational methods, many problems are still far from being solved. In this study, a novel predictor was designed by using the Random Forest (RF) algorithm with the ensemble coding (EC) method. To reduce computational time, a feature selection method (DX) was adopted to rank the features and search the optimal feature combination. The DXEC method integrates many features and physicochemical/biochemical properties to predict PPIs. On the Gold Yeast dataset, the DXEC method achieves 67.2% overall precision, 80.74% recall, and 70.67% accuracy. On the Silver Yeast dataset, the DXEC method achieves 76.93% precision, 77.98% recall, and 77.27% accuracy. On the human dataset, the prediction accuracy reaches 80% for the DXEC-RF method. We extended the experiment to a bigger and more realistic dataset that maintains 50% recall on the Yeast All dataset and 80% recall on the Human All dataset. These results show that the DXEC method is suitable for performing PPI prediction. The prediction service of the DXEC-RF classifier is available at http://ailab.ahu.edu.cn:8087/DXECPPI/index.jsp.
format Online
Article
Text
id pubmed-4139871
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4139871 2014-08-21 Int J Mol Sci Article MDPI 2014-07-18 /pmc/articles/PMC4139871/ /pubmed/25046746 http://dx.doi.org/10.3390/ijms150712731 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title A Novel Feature Extraction Scheme with Ensemble Coding for Protein–Protein Interaction Prediction
topic Article