A supervised protein complex prediction method with network representation learning and gene ontology knowledge
BACKGROUND: Protein complexes are essential for biologists to understand cell organization and function. In recent years, predicting complexes from protein–protein interaction (PPI) networks by computational methods has become an active research topic. Many methods for protein c...
Main Authors: | Wang, Xiaoxu; Zhang, Yijia; Zhou, Peixuan; Liu, Xiaoxia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9317086/ https://www.ncbi.nlm.nih.gov/pubmed/35879648 http://dx.doi.org/10.1186/s12859-022-04850-4 |
_version_ | 1784754971878621184 |
---|---|
author | Wang, Xiaoxu; Zhang, Yijia; Zhou, Peixuan; Liu, Xiaoxia |
author_facet | Wang, Xiaoxu; Zhang, Yijia; Zhou, Peixuan; Liu, Xiaoxia |
author_sort | Wang, Xiaoxu |
collection | PubMed |
description | BACKGROUND: Protein complexes are essential for biologists to understand cell organization and function. In recent years, predicting complexes from protein–protein interaction (PPI) networks by computational methods has become an active research topic, and many prediction methods have been proposed. However, how to make use of the information in known protein complexes remains a fundamental open problem in protein complex prediction. RESULTS: To address this problem, we propose a supervised learning method based on network representation learning and gene ontology knowledge, which uses the information of known protein complexes to predict new protein complexes. The method first constructs a weighted PPI network from gene ontology knowledge and topology information, which reduces the noise in the network. The topological information of known protein complexes is then extracted as features, and the supervised learning model SVCC is trained on these features. The SVCC model is used to predict candidate protein complexes from the protein interaction network. Next, a network representation learning method is used to obtain vector representations of the protein complexes and to train a random forest model. Finally, the random forest model classifies the candidate protein complexes to obtain the final predicted protein complexes. We evaluate the proposed method on two publicly available PPI data sets. CONCLUSIONS: Experimental results show that our method improves the performance of protein complex recognition compared with existing methods. We also analyze the biological significance of the protein complexes predicted by our method and by other methods; the results show that the protein complexes predicted by our method have high biological significance. |
format | Online Article Text |
id | pubmed-9317086 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9317086 2022-07-27 A supervised protein complex prediction method with network representation learning and gene ontology knowledge Wang, Xiaoxu; Zhang, Yijia; Zhou, Peixuan; Liu, Xiaoxia BMC Bioinformatics Research BioMed Central 2022-07-25 /pmc/articles/PMC9317086/ /pubmed/35879648 http://dx.doi.org/10.1186/s12859-022-04850-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | A supervised protein complex prediction method with network representation learning and gene ontology knowledge |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9317086/ https://www.ncbi.nlm.nih.gov/pubmed/35879648 http://dx.doi.org/10.1186/s12859-022-04850-4 |
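
The description above outlines a pipeline whose first steps are weighting the PPI network with gene ontology and topology information, then extracting topological features from protein complexes. The record does not give the formulas, so the following is only a minimal sketch under stated assumptions: the Jaccard-based similarity scores, the `alpha` blending parameter, the feature set, and the toy proteins and annotation labels are all hypothetical and are not taken from the paper.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_edges(edges, go_terms, alpha=0.5):
    """Weight each PPI edge by a blend of annotation and neighborhood similarity.

    ASSUMPTION: Jaccard scores and the alpha blend are illustrative choices,
    not the weighting scheme used by the authors.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        go_sim = jaccard(go_terms.get(u, set()), go_terms.get(v, set()))
        # Closed neighborhoods (node plus its neighbors), so the two
        # endpoints of an edge always share at least each other.
        topo_sim = jaccard(neighbors[u] | {u}, neighbors[v] | {v})
        weights[frozenset((u, v))] = alpha * go_sim + (1 - alpha) * topo_sim
    return weights


def subgraph_features(nodes, weights):
    """Size, edge density, and mean edge weight of a candidate complex.

    ASSUMPTION: a hypothetical feature set chosen for illustration.
    """
    nodes = set(nodes)
    inside = [w for pair, w in weights.items() if pair <= nodes]
    n = len(nodes)
    possible = n * (n - 1) / 2
    density = len(inside) / possible if possible else 0.0
    mean_weight = sum(inside) / len(inside) if inside else 0.0
    return {"size": n, "density": density, "mean_weight": mean_weight}


# Toy network with made-up protein names and annotation labels.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
go_terms = {
    "A": {"term1", "term2"},
    "B": {"term1", "term2"},
    "C": {"term1"},
    "D": {"term9"},
}
weights = weight_edges(edges, go_terms)
features = subgraph_features({"A", "B", "C"}, weights)
print(features)
```

Feature dictionaries of this kind, computed for known complexes and for random subgraphs, could serve as training rows for a supervised classifier; the later embedding and random forest stages mentioned in the description are not sketched here.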