Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis
BACKGROUND: Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, current PPI pairs obt...
Main authors: | You, Zhu-Hong; Lei, Ying-Ke; Zhu, Lin; Xia, Junfeng; Wang, Bing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3654889/ https://www.ncbi.nlm.nih.gov/pubmed/23815620 http://dx.doi.org/10.1186/1471-2105-14-S8-S10 |
_version_ | 1782269786314506240 |
---|---|
author | You, Zhu-Hong Lei, Ying-Ke Zhu, Lin Xia, Junfeng Wang, Bing |
author_facet | You, Zhu-Hong Lei, Ying-Ke Zhu, Lin Xia, Junfeng Wang, Bing |
author_sort | You, Zhu-Hong |
collection | PubMed |
description | BACKGROUND: Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, the PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods that predict PPIs efficiently and accurately. RESULTS: We present a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model that predicts protein-protein interactions using only the information in protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors using four kinds of protein sequence information. For dimension reduction, PCA, an effective feature extraction method, was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. Ensembling the extreme learning machines removes the dependence of the results on the initial random weights and improves prediction performance. CONCLUSIONS: When applied to the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at a precision of 87.59%. Extensive experiments were performed to compare our method with the state-of-the-art support vector machine (SVM) technique. Experimental results demonstrate that the proposed PCA-EELM outperforms the SVM method under 5-fold cross-validation. In addition, PCA-EELM runs faster than the PCA-SVM-based method. Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs with excellent performance in less time. |
format | Online Article Text |
id | pubmed-3654889 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3654889 2013-05-20 Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis You, Zhu-Hong Lei, Ying-Ke Zhu, Lin Xia, Junfeng Wang, Bing BMC Bioinformatics Proceedings BioMed Central 2013-05-09 /pmc/articles/PMC3654889/ /pubmed/23815620 http://dx.doi.org/10.1186/1471-2105-14-S8-S10 Text en Copyright © 2013 You et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings You, Zhu-Hong Lei, Ying-Ke Zhu, Lin Xia, Junfeng Wang, Bing Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
title | Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
title_full | Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
title_fullStr | Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
title_full_unstemmed | Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
title_short | Prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
title_sort | prediction of protein-protein interactions from amino acid sequences with ensemble extreme learning machines and principal component analysis |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3654889/ https://www.ncbi.nlm.nih.gov/pubmed/23815620 http://dx.doi.org/10.1186/1471-2105-14-S8-S10 |
work_keys_str_mv | AT youzhuhong predictionofproteinproteininteractionsfromaminoacidsequenceswithensembleextremelearningmachinesandprincipalcomponentanalysis AT leiyingke predictionofproteinproteininteractionsfromaminoacidsequenceswithensembleextremelearningmachinesandprincipalcomponentanalysis AT zhulin predictionofproteinproteininteractionsfromaminoacidsequenceswithensembleextremelearningmachinesandprincipalcomponentanalysis AT xiajunfeng predictionofproteinproteininteractionsfromaminoacidsequenceswithensembleextremelearningmachinesandprincipalcomponentanalysis AT wangbing predictionofproteinproteininteractionsfromaminoacidsequenceswithensembleextremelearningmachinesandprincipalcomponentanalysis |
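
The description field above outlines a pipeline of PCA for dimension reduction followed by several extreme learning machines combined by majority voting. The following is a minimal sketch of that general idea, not the authors' implementation. The record does not give the feature encoding, the number of retained components, the hidden layer size, the activation function, or the number of ensemble members, so the synthetic data and every parameter value here (`n_components=5`, `n_hidden=30`, seven members, `tanh` activation) are assumptions chosen only for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the top principal axes (as columns)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def pca_transform(X, mean, axes):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ axes

def elm_fit(X, y, n_hidden, rng):
    """Train one ELM: random fixed hidden layer, least-squares output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)           # hidden-layer activations
    beta = np.linalg.pinv(H) @ y     # Moore-Penrose solution; y is in {-1, +1}
    return W, b, beta

def elm_predict(model, X):
    """Predict labels in {-1, +1} with a single trained ELM."""
    W, b, beta = model
    return np.where(np.tanh(X @ W + b) @ beta >= 0, 1, -1)

def ensemble_predict(models, X):
    """Majority vote over the members (an odd member count avoids ties)."""
    votes = np.stack([elm_predict(m, X) for m in models])
    return np.where(votes.sum(axis=0) >= 0, 1, -1)

def metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), and precision."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == -1) & (y_pred == -1))
    fp = np.sum((y_true == -1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == -1))
    acc = (tp + tn) / len(y_true)
    sens = tp / max(tp + fn, 1)      # guard against empty denominators
    prec = tp / max(tp + fp, 1)
    return acc, sens, prec

# Synthetic stand-in for encoded protein pairs (an assumption, not DIP data):
# two Gaussian clusters in 20 dimensions, labeled +1 and -1.
rng = np.random.default_rng(0)

def make_data(n_per_class):
    pos = rng.standard_normal((n_per_class, 20)) + 2.0
    neg = rng.standard_normal((n_per_class, 20)) - 2.0
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return X, y

X_train, y_train = make_data(100)
X_test, y_test = make_data(100)

# Fit PCA on the training split only, then apply it to both splits.
mean, axes = pca_fit(X_train, n_components=5)
Z_train = pca_transform(X_train, mean, axes)
Z_test = pca_transform(X_test, mean, axes)

# Each member draws its own random hidden weights from the shared generator.
models = [elm_fit(Z_train, y_train, n_hidden=30, rng=rng) for _ in range(7)]
y_pred = ensemble_predict(models, Z_test)
acc, sens, prec = metrics(y_test, y_pred)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} precision={prec:.3f}")
```

Because every member starts from different random hidden weights, a single ELM's predictions can vary from run to run; voting across members is one simple way to reduce that variance, which is the motivation the abstract gives for the ensemble.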