Improved protein-protein interactions prediction via weighted sparse representation model combining continuous wavelet descriptor and PseAA composition

BACKGROUND: Protein-protein interactions (PPIs) are essential to most biological processes. As bioscience has entered the era of genomics and proteomics, there is a growing demand for knowledge about PPI networks. High-throughput biological technologies can be used to identify new PPIs, but t...

Bibliographic Details
Main Authors: Huang, Yu-An, You, Zhu-Hong, Chen, Xing, Yan, Gui-Ying
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260127/
https://www.ncbi.nlm.nih.gov/pubmed/28155718
http://dx.doi.org/10.1186/s12918-016-0360-6
author Huang, Yu-An
You, Zhu-Hong
Chen, Xing
Yan, Gui-Ying
collection PubMed
description BACKGROUND: Protein-protein interactions (PPIs) are essential to most biological processes. As bioscience has entered the era of genomics and proteomics, there is a growing demand for knowledge about PPI networks. High-throughput biological technologies can be used to identify new PPIs, but they are expensive, time-consuming, and tedious. Computational methods for predicting PPIs therefore play an important role. In recent years, an increasing number of computational methods, such as protein structure-based approaches, have been proposed for predicting PPIs. The major limitation of these methods is that they require prior information about the protein to infer PPIs. It is therefore important to develop computational methods that use only the amino acid sequence of the protein. RESULTS: Here, we report a highly efficient approach for predicting PPIs. The main improvements come from a novel protein sequence representation that combines a continuous wavelet descriptor with Chou’s pseudo amino acid composition (PseAAC), and from adopting a weighted sparse representation based classifier (WSRC). Cross-validated on the PPI datasets of Saccharomyces cerevisiae, Human and H. pylori, the method achieves accuracies as high as 92.50%, 95.54% and 84.28%, respectively, significantly better than previously proposed methods. Extensive experiments are performed to compare the proposed method with the state-of-the-art Support Vector Machine (SVM) classifier. CONCLUSIONS: The outstanding results obtained by our model show that the proposed feature extraction method, which combines two kinds of descriptors, has strong expressive ability and is expected to provide comprehensive and effective information for machine learning-based classification models. In addition, the prediction performance in the comparison experiments shows that the combined feature and WSRC work well together. Thus, the proposed method is a very efficient way to predict PPIs and may be a useful supplementary tool for future proteomics studies.
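The abstract names Chou's pseudo amino acid composition (PseAAC) as one half of the sequence representation but does not spell it out. The following is a minimal sketch of a type-1 PseAAC vector (20 composition terms plus `lam` sequence-order terms), not the authors' implementation. Several things here are assumptions and do not come from the record: the function name `pseaac`, the parameters `lam` and `weight` and their defaults, and the use of a single physicochemical property (the Kyte-Doolittle hydropathy index) as a stand-in for whatever property set the paper actually uses.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index. ASSUMPTION: used as a stand-in property;
# the record does not say which properties the paper uses.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(prop):
    """Rescale a property table to zero mean and unit standard deviation."""
    values = list(prop.values())
    mu, sigma = mean(values), pstdev(values)
    return {aa: (v - mu) / sigma for aa, v in prop.items()}


def pseaac(seq, lam=3, weight=0.05, prop=KD_HYDROPATHY):
    """Return a (20 + lam)-dimensional type-1 PseAAC vector (hypothetical helper)."""
    seq = seq.upper()
    n = len(seq)
    if any(aa not in prop for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if not 0 < lam < n:
        raise ValueError("lam must be positive and shorter than the sequence")

    h = standardize(prop)
    # Conventional amino acid composition: relative frequency of each residue.
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    # Sequence-order correlation factors: mean squared property difference
    # between residues that are k positions apart, for k = 1..lam.
    thetas = [
        sum((h[seq[i + k]] - h[seq[i]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    # Joint normalization so that all 20 + lam components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


features = pseaac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
print(len(features))  # 23
```

In a pipeline like the one the abstract describes, a vector of this kind would be concatenated with the wavelet-based descriptor for each protein before classification; how the paper builds and joins the two parts is not specified in this record.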
format Online
Article
Text
id pubmed-5260127
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5260127 2017-01-30 Improved protein-protein interactions prediction via weighted sparse representation model combining continuous wavelet descriptor and PseAA composition Huang, Yu-An You, Zhu-Hong Chen, Xing Yan, Gui-Ying BMC Syst Biol Research BACKGROUND: Protein-protein interactions (PPIs) are essential to most biological processes. As bioscience has entered the era of genomics and proteomics, there is a growing demand for knowledge about PPI networks. High-throughput biological technologies can be used to identify new PPIs, but they are expensive, time-consuming, and tedious. Computational methods for predicting PPIs therefore play an important role. In recent years, an increasing number of computational methods, such as protein structure-based approaches, have been proposed for predicting PPIs. The major limitation of these methods is that they require prior information about the protein to infer PPIs. It is therefore important to develop computational methods that use only the amino acid sequence of the protein. RESULTS: Here, we report a highly efficient approach for predicting PPIs. The main improvements come from a novel protein sequence representation that combines a continuous wavelet descriptor with Chou’s pseudo amino acid composition (PseAAC), and from adopting a weighted sparse representation based classifier (WSRC). Cross-validated on the PPI datasets of Saccharomyces cerevisiae, Human and H. pylori, the method achieves accuracies as high as 92.50%, 95.54% and 84.28%, respectively, significantly better than previously proposed methods. Extensive experiments are performed to compare the proposed method with the state-of-the-art Support Vector Machine (SVM) classifier. CONCLUSIONS: The outstanding results obtained by our model show that the proposed feature extraction method, which combines two kinds of descriptors, has strong expressive ability and is expected to provide comprehensive and effective information for machine learning-based classification models. In addition, the prediction performance in the comparison experiments shows that the combined feature and WSRC work well together. Thus, the proposed method is a very efficient way to predict PPIs and may be a useful supplementary tool for future proteomics studies. BioMed Central 2016-12-23 /pmc/articles/PMC5260127/ /pubmed/28155718 http://dx.doi.org/10.1186/s12918-016-0360-6 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Improved protein-protein interactions prediction via weighted sparse representation model combining continuous wavelet descriptor and PseAA composition
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260127/
https://www.ncbi.nlm.nih.gov/pubmed/28155718
http://dx.doi.org/10.1186/s12918-016-0360-6