
Sequence-based prediction of protein-protein interactions using weighted sparse representation model combined with global encoding

BACKGROUND: Proteins are the important molecules which participate in virtually every aspect of cellular function within an organism in pairs. Although high-throughput technologies have generated considerable protein-protein interactions (PPIs) data for various species, the processes of experimental...


Bibliographic Details
Main authors: Huang, Yu-An, You, Zhu-Hong, Chen, Xing, Chan, Keith, Luo, Xin
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4845433/
https://www.ncbi.nlm.nih.gov/pubmed/27112932
http://dx.doi.org/10.1186/s12859-016-1035-4
author Huang, Yu-An
You, Zhu-Hong
Chen, Xing
Chan, Keith
Luo, Xin
collection PubMed
description BACKGROUND: Proteins are the important molecules which participate in virtually every aspect of cellular function within an organism in pairs. Although high-throughput technologies have generated considerable protein-protein interactions (PPIs) data for various species, the processes of experimental methods are both time-consuming and expensive. In addition, they are usually associated with high rates of both false positive and false negative results. Accordingly, a number of computational approaches have been developed to effectively and accurately predict protein interactions. However, most of these methods typically perform worse when other biological data sources (e.g., protein structure information, protein domains, or gene neighborhoods information) are not available. Therefore, it is very urgent to develop effective computational methods for prediction of PPIs solely using protein sequence information. RESULTS: In this study, we present a novel computational model combining weighted sparse representation based classifier (WSRC) and global encoding (GE) of amino acid sequence. Two kinds of protein descriptors, composition and transition, are extracted for representing each protein sequence. On the basis of such a feature representation, novel weighted sparse representation based classifier is introduced to predict protein interaction class. When the proposed method was evaluated with the PPIs data of S. cerevisiae, Human and H. pylori, it achieved high prediction accuracies of 96.82, 97.66 and 92.83 % respectively. Extensive experiments were performed for cross-species PPIs prediction and the prediction accuracies were also very promising. CONCLUSIONS: To further evaluate the performance of the proposed method, we then compared its performance with the method based on support vector machine (SVM). The results show that the proposed method achieved a significant improvement. Thus, the proposed method is a very efficient method to predict PPIs and may be a useful supplementary tool for future proteomics studies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1035-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4845433
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4845433 2016-04-27 Sequence-based prediction of protein-protein interactions using weighted sparse representation model combined with global encoding Huang, Yu-An You, Zhu-Hong Chen, Xing Chan, Keith Luo, Xin BMC Bioinformatics Research Article BioMed Central 2016-04-26 /pmc/articles/PMC4845433/ /pubmed/27112932 http://dx.doi.org/10.1186/s12859-016-1035-4 Text en © Huang et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Sequence-based prediction of protein-protein interactions using weighted sparse representation model combined with global encoding
topic Research Article