Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation
Protein-protein interactions (PPIs) play a crucial role in understanding disease pathogenesis and genetic mechanisms, in guiding drug design, and in other biochemical processes; thus, the identification of PPIs is of great importance. With the rapid development of high-throughput sequencing technology, a lar...
Main Authors: | Li, Yang; Wang, Zheng; You, Zhu-Hong; Li, Li-Ping; Hu, Xuegang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8888042/ https://www.ncbi.nlm.nih.gov/pubmed/35242211 http://dx.doi.org/10.1155/2022/7191684 |
_version_ | 1784661041863458816 |
---|---|
author | Li, Yang; Wang, Zheng; You, Zhu-Hong; Li, Li-Ping; Hu, Xuegang |
author_facet | Li, Yang; Wang, Zheng; You, Zhu-Hong; Li, Li-Ping; Hu, Xuegang |
author_sort | Li, Yang |
collection | PubMed |
description | Protein-protein interactions (PPIs) play a crucial role in understanding disease pathogenesis and genetic mechanisms, in guiding drug design, and in other biochemical processes; thus, the identification of PPIs is of great importance. With the rapid development of high-throughput sequencing technology, a large amount of PPI sequence data has been accumulated. Researchers have designed many experimental methods to detect PPIs by using these sequence data; hence, the prediction of PPIs has become a research hotspot in proteomics. However, since traditional experimental methods are both time-consuming and costly, it is difficult to analyze and predict the massive amount of PPI data quickly and accurately. To address these issues, many computational systems employing machine learning have been applied to PPI prediction, thereby improving the overall recognition rate. In this paper, a novel and efficient computational method is presented to implement a protein interaction prediction system using only protein sequence information. First, the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST) was employed to generate a position-specific scoring matrix (PSSM) containing protein evolutionary information from the initial protein sequence. Second, we used a novel data-processing and feature representation scheme, MatFLDA, to extract the essential information of the PSSM for protein sequences and obtained five training and five testing datasets by adopting a five-fold cross-validation method. Finally, the random ferns (RFs) classifier was employed to infer the interactions among proteins, and a model called MatFLDA_RFs was developed. The proposed MatFLDA_RFs model achieved good prediction performance, with 95.03% average accuracy on the Yeast dataset and 85.35% average accuracy on the H. pylori dataset, outperforming other existing computational methods. The experimental results indicate that the proposed method is capable of yielding better PPI prediction results, providing an effective tool for the detection of new PPIs and the in-depth study of proteomics. Finally, we also developed a web server for the proposed model to predict protein-protein interactions, which is freely accessible online at http://120.77.11.78:5001/webserver/MatFLDA_RFs. |
format | Online Article Text |
id | pubmed-8888042 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8888042 2022-03-02 Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation Li, Yang Wang, Zheng You, Zhu-Hong Li, Li-Ping Hu, Xuegang Comput Math Methods Med Research Article Hindawi 2022-02-22 /pmc/articles/PMC8888042/ /pubmed/35242211 http://dx.doi.org/10.1155/2022/7191684 Text en Copyright © 2022 Yang Li et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Li, Yang Wang, Zheng You, Zhu-Hong Li, Li-Ping Hu, Xuegang Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation |
title | Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation |
title_full | Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation |
title_fullStr | Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation |
title_full_unstemmed | Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation |
title_short | Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation |
title_sort | predicting protein-protein interactions via random ferns with evolutionary matrix representation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8888042/ https://www.ncbi.nlm.nih.gov/pubmed/35242211 http://dx.doi.org/10.1155/2022/7191684 |
work_keys_str_mv | AT liyang predictingproteinproteininteractionsviarandomfernswithevolutionarymatrixrepresentation AT wangzheng predictingproteinproteininteractionsviarandomfernswithevolutionarymatrixrepresentation AT youzhuhong predictingproteinproteininteractionsviarandomfernswithevolutionarymatrixrepresentation AT liliping predictingproteinproteininteractionsviarandomfernswithevolutionarymatrixrepresentation AT huxuegang predictingproteinproteininteractionsviarandomfernswithevolutionarymatrixrepresentation |
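
The last two steps named in the description (five-fold cross-validation and a random ferns classifier) can be illustrated with a small sketch. This is not the authors' MatFLDA_RFs implementation: the class `RandomFerns`, the function `k_fold_accuracy`, and all parameter values are hypothetical, the input vectors merely stand in for MatFLDA features (which are not reproduced here), and a uniform class prior is assumed.

```python
import math
import random


class RandomFerns:
    """Illustrative random ferns classifier (semi-naive Bayes over random tests)."""

    def __init__(self, n_ferns=20, depth=4, seed=0):
        self.n_ferns = n_ferns   # number of ferns (assumed value)
        self.depth = depth       # binary tests per fern (assumed value)
        self.rng = random.Random(seed)

    def _leaf(self, fern, x):
        # Pack the outcomes of the fern's binary tests into one integer index.
        idx = 0
        for feat, thr in fern:
            idx = (idx << 1) | (1 if x[feat] > thr else 0)
        return idx

    def fit(self, X, y):
        n_features = len(X[0])
        self.classes = sorted(set(y))
        lo = [min(row[j] for row in X) for j in range(n_features)]
        hi = [max(row[j] for row in X) for j in range(n_features)]
        # Each fern is a list of `depth` random (feature, threshold) tests.
        self.ferns = [
            [self._random_test(n_features, lo, hi) for _ in range(self.depth)]
            for _ in range(self.n_ferns)
        ]
        n_leaves = 2 ** self.depth
        self.log_post = []
        for fern in self.ferns:
            # Start every count at 1 (Laplace smoothing) to avoid log(0).
            counts = {c: [1.0] * n_leaves for c in self.classes}
            for row, label in zip(X, y):
                counts[label][self._leaf(fern, row)] += 1.0
            table = {}
            for c in self.classes:
                total = sum(counts[c])
                table[c] = [math.log(v / total) for v in counts[c]]
            self.log_post.append(table)
        return self

    def _random_test(self, n_features, lo, hi):
        j = self.rng.randrange(n_features)
        return (j, self.rng.uniform(lo[j], hi[j]))

    def predict(self, X):
        out = []
        for row in X:
            # Sum per-fern log-likelihoods; uniform class prior assumed.
            scores = {c: 0.0 for c in self.classes}
            for fern, table in zip(self.ferns, self.log_post):
                leaf = self._leaf(fern, row)
                for c in self.classes:
                    scores[c] += table[c][leaf]
            out.append(max(self.classes, key=lambda c: scores[c]))
        return out


def k_fold_accuracy(X, y, k=5, seed=0):
    """Average accuracy over k folds (k=5 mirrors the five-fold setup)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        model = RandomFerns(seed=seed).fit([X[j] for j in train],
                                           [y[j] for j in train])
        pred = model.predict([X[j] for j in test])
        accs.append(sum(p == y[j] for p, j in zip(pred, test)) / len(test))
    return sum(accs) / k
```

Calling `k_fold_accuracy(X, y)` with a list of numeric feature vectors `X` and labels `y` (for example 1 for interacting and 0 for non-interacting pairs) returns the mean accuracy across the five held-out folds, the kind of figure the description reports for the Yeast and H. pylori datasets.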