
Improving protein–protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model

Predicting protein–protein interactions (PPIs) is a challenging task that is essential for constructing protein interaction networks, which are important for understanding the mechanisms of biological systems. Although a number of high‐throughput technologies have been proposed to pre...


Bibliographic Details
Main authors: An, Ji‐Yong; Meng, Fan‐Rong; You, Zhu‐Hong; Chen, Xing; Yan, Gui‐Ying; Hu, Ji‐Pu
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5029537/
https://www.ncbi.nlm.nih.gov/pubmed/27452983
http://dx.doi.org/10.1002/pro.2991
author An, Ji‐Yong
Meng, Fan‐Rong
You, Zhu‐Hong
Chen, Xing
Yan, Gui‐Ying
Hu, Ji‐Pu
author_sort An, Ji‐Yong
collection PubMed
description Predicting protein–protein interactions (PPIs) is a challenging task that is essential for constructing protein interaction networks, which are important for understanding the mechanisms of biological systems. Although a number of high‐throughput technologies have been proposed to predict PPIs, they have unavoidable shortcomings, including high cost, time intensity, and inherently high false positive rates. For these reasons, many computational methods have been proposed for predicting PPIs. However, the problem is still far from solved. In this article, we propose a novel computational method called RVM‐BiGP that combines the relevance vector machine (RVM) model and Bi‐gram Probabilities (BiGP) to detect PPIs from protein sequences. The major improvements are: (1) protein sequences are represented using the BiGP feature representation on a Position Specific Scoring Matrix (PSSM), which contains protein evolutionary information; (2) to reduce the influence of noise, Principal Component Analysis (PCA) is used to reduce the dimension of the BiGP vector; (3) the powerful and robust RVM algorithm is used for classification. Five‐fold cross‐validation experiments on the yeast and Helicobacter pylori datasets achieved very high accuracies of 94.57% and 90.57%, respectively. These results are significantly better than those of previous methods. To further evaluate the proposed method, we compare it with the state‐of‐the‐art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM‐BiGP method is significantly better than the SVM‐based method. In addition, we achieved 97.15% accuracy on the imbalanced yeast dataset, which is higher than the accuracy on the balanced yeast dataset.
The promising experimental results show the efficiency and robustness of the proposed method, which can serve as an automatic decision support tool for future proteomics research. To facilitate extensive studies in future proteomics research, we developed a freely available web server called RVM‐BiGP‐PPIs in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including the source code and the datasets, is available at http://219.219.62.123:8888/BiGP/.
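The feature-extraction steps named in the description (bi-gram probabilities computed on a PSSM, followed by PCA) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly used bi-gram formulation B[m, n] = sum over i of P[i, m] * P[i+1, n] for an L x 20 PSSM P, which the record itself does not spell out. The function names `bigram_features`, `pair_features`, and `pca_reduce` are hypothetical, and concatenating the two proteins' vectors to describe a pair is likewise an assumption. The RVM classifier is omitted.

```python
import numpy as np


def bigram_features(pssm):
    """Return a 400-dimensional bi-gram vector for an (L, 20) PSSM.

    Assumed formulation (not stated in the record):
    B[m, n] = sum_{i=0}^{L-2} P[i, m] * P[i+1, n]
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[1] != 20 or pssm.shape[0] < 2:
        raise ValueError("expected a PSSM of shape (L, 20) with L >= 2")
    # Rows 0..L-2 paired with rows 1..L-1 give all consecutive positions.
    bigram = pssm[:-1].T @ pssm[1:]  # shape (20, 20)
    return bigram.ravel()            # shape (400,)


def pair_features(pssm_a, pssm_b):
    """Describe a protein pair by concatenating both bi-gram vectors
    (an illustrative assumption, giving 800 values per pair)."""
    return np.concatenate([bigram_features(pssm_a), bigram_features(pssm_b)])


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for real PSSMs, for demonstration only.
    pairs = [
        pair_features(rng.random((50, 20)), rng.random((60, 20)))
        for _ in range(10)
    ]
    reduced = pca_reduce(np.vstack(pairs), n_components=5)
    print(reduced.shape)
```

The reduced vectors would then be passed to a classifier and evaluated with five-fold cross-validation, as the description outlines. The number of retained components is a free parameter in this sketch.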
format Online
Article
Text
id pubmed-5029537
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-5029537 2016-09-26 Improving protein–protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model An, Ji‐Yong Meng, Fan‐Rong You, Zhu‐Hong Chen, Xing Yan, Gui‐Ying Hu, Ji‐Pu Protein Sci Articles Predicting protein–protein interactions (PPIs) is a challenging task that is essential for constructing protein interaction networks, which are important for understanding the mechanisms of biological systems. Although a number of high‐throughput technologies have been proposed to predict PPIs, they have unavoidable shortcomings, including high cost, time intensity, and inherently high false positive rates. For these reasons, many computational methods have been proposed for predicting PPIs. However, the problem is still far from solved. In this article, we propose a novel computational method called RVM‐BiGP that combines the relevance vector machine (RVM) model and Bi‐gram Probabilities (BiGP) to detect PPIs from protein sequences. The major improvements are: (1) protein sequences are represented using the BiGP feature representation on a Position Specific Scoring Matrix (PSSM), which contains protein evolutionary information; (2) to reduce the influence of noise, Principal Component Analysis (PCA) is used to reduce the dimension of the BiGP vector; (3) the powerful and robust RVM algorithm is used for classification. Five‐fold cross‐validation experiments on the yeast and Helicobacter pylori datasets achieved very high accuracies of 94.57% and 90.57%, respectively. These results are significantly better than those of previous methods. To further evaluate the proposed method, we compare it with the state‐of‐the‐art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM‐BiGP method is significantly better than the SVM‐based method.
In addition, we achieved 97.15% accuracy on the imbalanced yeast dataset, which is higher than the accuracy on the balanced yeast dataset. The promising experimental results show the efficiency and robustness of the proposed method, which can serve as an automatic decision support tool for future proteomics research. To facilitate extensive studies in future proteomics research, we developed a freely available web server called RVM‐BiGP‐PPIs in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including the source code and the datasets, is available at http://219.219.62.123:8888/BiGP/. John Wiley and Sons Inc. 2016-08-09 2016-10 /pmc/articles/PMC5029537/ /pubmed/27452983 http://dx.doi.org/10.1002/pro.2991 Text en © 2016 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs (http://creativecommons.org/licenses/by-nc-nd/4.0/) License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title Improving protein–protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5029537/
https://www.ncbi.nlm.nih.gov/pubmed/27452983
http://dx.doi.org/10.1002/pro.2991