PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery
Risk gene identification has attracted much attention in the past two decades. Since most genes need to be translated into proteins and cooperate with other proteins to form protein complexes to carry out cellular functions, which significantly extends the functional diversity of individual proteins...
Main Authors: | Wang, Wei; Yuan, Haiyan; Han, Junwei; Liu, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9791601/ https://www.ncbi.nlm.nih.gov/pubmed/36582441 http://dx.doi.org/10.1016/j.csbj.2022.12.005 |
_version_ | 1784859443735822336 |
---|---|
author | Wang, Wei; Yuan, Haiyan; Han, Junwei; Liu, Wei |
author_facet | Wang, Wei; Yuan, Haiyan; Han, Junwei; Liu, Wei |
author_sort | Wang, Wei |
collection | PubMed |
description | Risk gene identification has attracted much attention in the past two decades. Since most genes need to be translated into proteins and cooperate with other proteins to form protein complexes to carry out cellular functions, which significantly extends the functional diversity of individual proteins, revealing the molecular mechanism of cancer from a comprehensive perspective needs to shift from identifying individual risk genes toward identifying risk protein complexes. Here, we embed protein complexes into the regularized learning framework and propose a protein complex-based, group Lasso-logistic model (PCLassoLog) to discover risk protein complexes. Experiments on deep proteomic data of two cancer types show that PCLassoLog yields superior predictive performance on independent datasets. More importantly, PCLassoLog identifies risk protein complexes that not only contain individual risk proteins but also incorporate close partners that synergize with them. Furthermore, selection probabilities are calculated and two other protein complex-based models are proposed to complement PCLassoLog in identifying reliable risk protein complexes. Based on PCLassoLog, a pan-cancer analysis is performed to identify risk protein complexes in 12 cancer types. Finally, PCLassoLog is used to discover risk protein complexes associated with gene mutation. We implement all protein complex-based models as an R package PCLassoReg, which may serve as an effective tool to discover risk protein complexes in various contexts. |
format | Online Article Text |
id | pubmed-9791601 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-97916012022-12-28 PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery Wang, Wei Yuan, Haiyan Han, Junwei Liu, Wei Comput Struct Biotechnol J Research Article Risk gene identification has attracted much attention in the past two decades. Since most genes need to be translated into proteins and cooperate with other proteins to form protein complexes to carry out cellular functions, which significantly extends the functional diversity of individual proteins, revealing the molecular mechanism of cancer from a comprehensive perspective needs to shift from identifying individual risk genes toward identifying risk protein complexes. Here, we embed protein complexes into the regularized learning framework and propose a protein complex-based, group Lasso-logistic model (PCLassoLog) to discover risk protein complexes. Experiments on deep proteomic data of two cancer types show that PCLassoLog yields superior predictive performance on independent datasets. More importantly, PCLassoLog identifies risk protein complexes that not only contain individual risk proteins but also incorporate close partners that synergize with them. Furthermore, selection probabilities are calculated and two other protein complex-based models are proposed to complement PCLassoLog in identifying reliable risk protein complexes. Based on PCLassoLog, a pan-cancer analysis is performed to identify risk protein complexes in 12 cancer types. Finally, PCLassoLog is used to discover risk protein complexes associated with gene mutation. We implement all protein complex-based models as an R package PCLassoReg, which may serve as an effective tool to discover risk protein complexes in various contexts. 
Research Network of Computational and Structural Biotechnology 2022-12-06 /pmc/articles/PMC9791601/ /pubmed/36582441 http://dx.doi.org/10.1016/j.csbj.2022.12.005 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Wang, Wei Yuan, Haiyan Han, Junwei Liu, Wei PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery |
title | PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery |
title_full | PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery |
title_fullStr | PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery |
title_full_unstemmed | PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery |
title_short | PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery |
title_sort | pclassolog: a protein complex-based, group lasso-logistic model for cancer classification and risk protein complex discovery |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9791601/ https://www.ncbi.nlm.nih.gov/pubmed/36582441 http://dx.doi.org/10.1016/j.csbj.2022.12.005 |
work_keys_str_mv | AT wangwei pclassologaproteincomplexbasedgrouplassologisticmodelforcancerclassificationandriskproteincomplexdiscovery AT yuanhaiyan pclassologaproteincomplexbasedgrouplassologisticmodelforcancerclassificationandriskproteincomplexdiscovery AT hanjunwei pclassologaproteincomplexbasedgrouplassologisticmodelforcancerclassificationandriskproteincomplexdiscovery AT liuwei pclassologaproteincomplexbasedgrouplassologisticmodelforcancerclassificationandriskproteincomplexdiscovery |