A Data Science Approach for the Identification of Molecular Signatures of Aggressive Cancers

SIMPLE SUMMARY: Traditionally, chemotherapy has been approached through one-size-fits-all strategies. However, personalized oncology would allow a rational approach to chemotherapies. Classically, cancer diagnosis and prognosis are performed through mutation mapping, but this genomic approach has an...

Bibliographic Details
Main Authors: Barbosa-Silva, Adriano, Magalhães, Milena, da Silva, Gilberto Ferreira, da Silva, Fabricio Alves Barbosa, Carneiro, Flávia Raquel Gonçalves, Carels, Nicolas
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9103663/
https://www.ncbi.nlm.nih.gov/pubmed/35565454
http://dx.doi.org/10.3390/cancers14092325
author Barbosa-Silva, Adriano
Magalhães, Milena
da Silva, Gilberto Ferreira
da Silva, Fabricio Alves Barbosa
Carneiro, Flávia Raquel Gonçalves
Carels, Nicolas
author_sort Barbosa-Silva, Adriano
collection PubMed
description SIMPLE SUMMARY: Traditionally, chemotherapy has been approached through one-size-fits-all strategies. However, personalized oncology would allow a rational approach to chemotherapies. Classically, cancer diagnosis and prognosis are performed through mutation mapping, but this genomic approach has an indirect relationship with the disease since it is based on the results of statistics accumulated over time. By contrast, a strategy based on gene expression would make it possible to identify the actual disease phenotype and focus on its specific molecular targets. In previous reports, we paved the way in that direction by successively showing that targeting up-regulated hubs is a suitable strategy to drive a tumor toward cell death and that the number of proteins to be targeted is typically between 3 and 10 according to tumor aggressiveness. In this report, we focused on the up-regulated genes of crucial cell signaling pathways, which are key hallmarks of unregulated cell division and apoptosis. Using principal component analysis, we identified the genes that most explain the aggressiveness among cancer types. We also identified the genes that best discriminated between aggressive and mild cancers using the random forest algorithm. Finally, by mapping these genes on the human interactome, we showed that they were close neighbors.
ABSTRACT: The main hallmarks of cancer include sustaining proliferative signaling and resisting cell death. We analyzed the genes of the WNT pathway and seven cross-linked pathways that may explain the differences in aggressiveness among cancer types. We divided six cancer types (liver, lung, stomach, kidney, prostate, and thyroid) into classes of high (H) and low (L) aggressiveness based on TCGA data and the correlation between Shannon entropy and 5-year overall survival (OS). Then, we used principal component analysis (PCA), a random forest classifier (RFC), and protein–protein interactions (PPI) to find the genes that correlated with aggressiveness. Using PCA, we found GRB2, CTNNB1, SKP1, CSNK2A1, PRKDC, HDAC1, YWHAZ, YWHAB, and PSMD2. The RFC analysis produced a different list that shared only PSMD2 with the PCA list: CAD, PSMD14, APH1A, PSMD2, SHC1, TMEFF2, PSMD11, H2AFZ, PSMB5, and NOTCH1. The two methods use different algorithmic approaches and have different purposes, which explains the discrepancy between the two gene lists. The key genes of aggressiveness found by PCA were those that maximized the separation of the H and L classes along the third principal component, which accounted for 19% of the total variance. By contrast, the RFC classified the RNA-seq profile of a tumor sample as either H or L. Interestingly, PPIs showed that the genes of the PCA and RFC lists were connected neighbors in the PPI signaling network of WNT and cross-linked pathways.
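The PCA step described in the abstract (ranking genes by how strongly they contribute to a principal component that separates the H and L classes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the expression matrix is synthetic, the gene names GENE_A to GENE_E are hypothetical placeholders, and the helper `pca_gene_loadings` is introduced here only for illustration. The abstract reports that the third component separated the classes in the real data; in this toy data the planted signal lands on the first component, so `component=0` is used.

```python
import numpy as np


def pca_gene_loadings(expr, gene_names, component=0, top_k=3):
    """Rank genes by absolute loading on one principal component.

    expr is a (samples x genes) matrix. Returns the top_k (gene, loading)
    pairs and the fraction of total variance explained by that component.
    """
    # Center each gene (column) so the SVD yields principal axes.
    centered = expr - expr.mean(axis=0)
    # Rows of vt are the principal axes; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    loadings = vt[component]
    # Genes with the largest absolute loading contribute most to the axis.
    order = np.argsort(-np.abs(loadings))[:top_k]
    top = [(gene_names[i], float(loadings[i])) for i in order]
    return top, float(explained[component])


# Synthetic data (assumption): 20 "L" and 20 "H" samples, 5 placeholder genes.
rng = np.random.default_rng(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
labels = np.array([0] * 20 + [1] * 20)  # 0 = low, 1 = high aggressiveness
expr = rng.normal(0.0, 1.0, size=(40, 5))
expr[labels == 1, 1] += 10.0  # plant a strong H-vs-L shift in GENE_B only

top_genes, var_ratio = pca_gene_loadings(expr, genes, component=0, top_k=2)
print(top_genes, round(var_ratio, 2))
```

On real RNA-seq data one would inspect several components and check which of them actually separates the H and L samples before reading off the loadings; the component index is a parameter here for that reason.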
format Online
Article
Text
id pubmed-9103663
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9103663 2022-05-14 A Data Science Approach for the Identification of Molecular Signatures of Aggressive Cancers Barbosa-Silva, Adriano Magalhães, Milena da Silva, Gilberto Ferreira da Silva, Fabricio Alves Barbosa Carneiro, Flávia Raquel Gonçalves Carels, Nicolas Cancers (Basel) Article MDPI 2022-05-07 /pmc/articles/PMC9103663/ /pubmed/35565454 http://dx.doi.org/10.3390/cancers14092325 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Data Science Approach for the Identification of Molecular Signatures of Aggressive Cancers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9103663/
https://www.ncbi.nlm.nih.gov/pubmed/35565454
http://dx.doi.org/10.3390/cancers14092325