Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer
Tumor heterogeneity and the unclear metastasis mechanisms are the leading cause for the unavailability of effective targeted therapy for Triple-negative breast cancer (TNBC), a breast cancer (BrCa) subtype characterized by high mortality and high frequency of distant metastasis cases. The identifica...
Main Authors: | Thalor, Anamika; Kumar Joon, Hemant; Singh, Gagandeep; Roy, Shikha; Gupta, Dinesh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9014315/ https://www.ncbi.nlm.nih.gov/pubmed/35465161 http://dx.doi.org/10.1016/j.csbj.2022.03.019 |
_version_ | 1784688181176696832 |
---|---|
author | Thalor, Anamika Kumar Joon, Hemant Singh, Gagandeep Roy, Shikha Gupta, Dinesh |
author_sort | Thalor, Anamika |
collection | PubMed |
description | Tumor heterogeneity and unclear metastasis mechanisms are the leading causes of the lack of effective targeted therapy for triple-negative breast cancer (TNBC), a breast cancer (BrCa) subtype characterized by high mortality and a high frequency of distant metastasis. The identification of prognostic biomarkers can improve prognosis and personalized treatment regimes. Herein, we collected gene expression datasets representing TNBC and non-TNBC BrCa. From the complete dataset, a subset comprising solely known cancer driver genes was also constructed. Recursive Feature Elimination (RFE) was employed to identify the top 20, 25, 30, 35, 40, 45, and 50 gene signatures that differentiate TNBC from the other BrCa subtypes. Five machine learning algorithms were applied to these selected features and, on the basis of model performance evaluation, XGBoost performed best on the complete and driver datasets with subsets of 25 and 20 genes, respectively. Of these 45 genes from the two datasets, 34 were found to be differentially regulated. Kaplan-Meier (KM) analysis of Distant Metastasis Free Survival (DMFS) for these 34 differentially regulated genes revealed four potential prognostic genes, two of which (POU2AF1 and S100B) are novel. Finally, interactome and pathway enrichment analyses were carried out to investigate the functional role of the identified potential prognostic genes in TNBC. These genes are associated with MAPK, PI3K-Akt, Wnt, TGF-β, and other signal transduction pathways that are pivotal in the metastasis cascade. These gene signatures can provide novel molecular-level insights into metastasis. |
format | Online Article Text |
id | pubmed-9014315 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9014315 2022-04-21 Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer Thalor, Anamika Kumar Joon, Hemant Singh, Gagandeep Roy, Shikha Gupta, Dinesh Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2022-03-24 /pmc/articles/PMC9014315/ /pubmed/35465161 http://dx.doi.org/10.1016/j.csbj.2022.03.019 Text en © 2022 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Thalor, Anamika Kumar Joon, Hemant Singh, Gagandeep Roy, Shikha Gupta, Dinesh Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer |
title | Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9014315/ https://www.ncbi.nlm.nih.gov/pubmed/35465161 http://dx.doi.org/10.1016/j.csbj.2022.03.019 |
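
The description above mentions Kaplan-Meier (KM) analysis of Distant Metastasis Free Survival. As a rough illustration of what a KM estimate computes, here is a minimal sketch in plain Python. It is not the authors' code or tooling: the function name `kaplan_meier`, the variable `high_expr`, and the toy follow-up data are all hypothetical and invented for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  : follow-up time per patient (hypothetical units, e.g. months)
    events : 1 if distant metastasis was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    # Number of observed events at each time point.
    observed = Counter(t for t, e in zip(times, events) if e == 1)
    # Every patient leaves the risk set at their own time (event or censoring).
    leaving = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy data, invented for illustration only (not taken from the study).
high_expr = kaplan_meier([5, 8, 12, 12, 20, 24], [1, 0, 1, 1, 0, 0])
for t, s in high_expr:
    print(f"t={t}: S(t)={s:.3f}")
```

In practice, a prognostic comparison would compute one such curve per expression group (for example, high versus low expression of a candidate gene) and compare them with a log-rank test, which this sketch does not implement.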