An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks

Detecting protein complexes is one of the keys to understanding the principles of cellular organization and processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. However, most computational methods are bas...

Bibliographic Details
Main Authors: Wang, Rongquan, Ma, Huimin, Wang, Caixia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908451/
https://www.ncbi.nlm.nih.gov/pubmed/35281831
http://dx.doi.org/10.3389/fgene.2022.839949
author Wang, Rongquan
Ma, Huimin
Wang, Caixia
collection PubMed
description Detecting protein complexes is one of the keys to understanding the principles of cellular organization and processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. However, most computational methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods do not need training datasets, but they can only detect protein complexes with one or a few topological structures. Supervised learning-based methods can detect protein complexes with different topological structures. However, they are usually based on a single type of training model, and the generalization of a single model is poor. Therefore, we propose an Ensemble Learning Framework for Detecting Protein Complexes (ELF-DPC) within protein-protein interaction (PPI) networks to address these challenges. ELF-DPC first constructs a weighted PPI network by combining topological and biological information. Second, it mines protein complex cores using a protein complex core mining strategy we designed. Third, it obtains an ensemble learning model by integrating structural modularity and a trained voting regressor model. Finally, it extends the protein complex cores and forms protein complexes using a graph heuristic search strategy. The experimental results demonstrate that ELF-DPC performs better than twelve state-of-the-art approaches. Moreover, functional enrichment analysis shows that ELF-DPC can detect biologically meaningful protein complexes. The code/dataset is available for free download from https://github.com/RongquanWang/ELF-DPC.
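The last two steps in the description (scoring candidates with an ensemble of structural modularity and a voting model, then extending cores by heuristic search) can be illustrated with a small sketch. This is not the ELF-DPC implementation: the graph is assumed to be weighted already, the "voting" part is an untrained average of two hand-written scorers standing in for a trained voting regressor, and every name below (`add_edge`, `structural_modularity`, `density`, `inside_fraction`, `ensemble_score`, `extend_core`, the toy network `ppi`) is hypothetical.

```python
from statistics import mean


def add_edge(graph, u, v, weight):
    """Store an undirected weighted interaction in a dict-of-dicts graph."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight


def internal_weight(graph, nodes):
    """Total weight of edges with both endpoints inside `nodes`."""
    # Each internal edge is seen from both endpoints, hence the division by 2.
    return sum(w for u in nodes for v, w in graph[u].items() if v in nodes) / 2.0


def boundary_weight(graph, nodes):
    """Total weight of edges leaving `nodes`."""
    return sum(w for u in nodes for v, w in graph[u].items() if v not in nodes)


def structural_modularity(graph, nodes):
    """Assumed stand-in: share of incident edge weight that stays inside."""
    w_in = internal_weight(graph, nodes)
    total = w_in + boundary_weight(graph, nodes)
    return w_in / total if total else 0.0


def density(graph, nodes):
    """Weighted density: internal weight over the number of possible pairs."""
    n = len(nodes)
    return internal_weight(graph, nodes) / (n * (n - 1) / 2) if n > 1 else 0.0


def inside_fraction(graph, nodes):
    """Mean, over members, of the share of each node's weight kept inside."""
    fractions = []
    for u in nodes:
        total = sum(graph[u].values())
        inside = sum(w for v, w in graph[u].items() if v in nodes)
        fractions.append(inside / total if total else 0.0)
    return mean(fractions)


def ensemble_score(graph, nodes, scorers):
    """Average the modularity term with the mean vote of the base scorers."""
    vote = mean(scorer(graph, nodes) for scorer in scorers)
    return 0.5 * structural_modularity(graph, nodes) + 0.5 * vote


def extend_core(graph, core, scorers):
    """Greedily add the neighbor that most improves the ensemble score."""
    members = set(core)
    best = ensemble_score(graph, members, scorers)
    while True:
        neighbors = {v for u in members for v in graph[u]} - members
        if not neighbors:
            break
        # Sorting makes tie-breaking deterministic.
        score, node = max(
            (ensemble_score(graph, members | {v}, scorers), v)
            for v in sorted(neighbors)
        )
        if score <= best:
            break  # no neighbor improves the candidate complex
        members.add(node)
        best = score
    return members


# Toy network: two triangles joined by one weak interaction (C-D).
ppi = {}
for u, v, w in [("A", "B", 1.0), ("A", "C", 1.0), ("B", "C", 1.0),
                ("C", "D", 0.2),
                ("D", "E", 1.0), ("D", "F", 1.0), ("E", "F", 1.0)]:
    add_edge(ppi, u, v, w)

SCORERS = [density, inside_fraction]
complex_from_ab = extend_core(ppi, {"A", "B"}, SCORERS)
```

In this toy network the core {A, B} grows to include C and then stops, because adding D across the weak C-D interaction lowers the combined score. A trained regressor would replace the hand-written scorers in a real pipeline.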
format Online
Article
Text
id pubmed-8908451
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8908451 2022-03-11 An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks Wang, Rongquan Ma, Huimin Wang, Caixia Front Genet Genetics Frontiers Media S.A. 2022-02-24 /pmc/articles/PMC8908451/ /pubmed/35281831 http://dx.doi.org/10.3389/fgene.2022.839949 Text en Copyright © 2022 Wang, Ma and Wang.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908451/
https://www.ncbi.nlm.nih.gov/pubmed/35281831
http://dx.doi.org/10.3389/fgene.2022.839949