
Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer


Bibliographic Details
Main Authors: Thalor, Anamika, Kumar Joon, Hemant, Singh, Gagandeep, Roy, Shikha, Gupta, Dinesh
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9014315/
https://www.ncbi.nlm.nih.gov/pubmed/35465161
http://dx.doi.org/10.1016/j.csbj.2022.03.019
author Thalor, Anamika
Kumar Joon, Hemant
Singh, Gagandeep
Roy, Shikha
Gupta, Dinesh
collection PubMed
description Tumor heterogeneity and unclear metastasis mechanisms are the leading causes of the lack of effective targeted therapy for triple-negative breast cancer (TNBC), a breast cancer (BrCa) subtype characterized by high mortality and a high frequency of distant metastasis. The identification of prognostic biomarkers can improve prognosis and personalized treatment regimes. Herein, we collected gene expression datasets representing TNBC and non-TNBC BrCa. From the complete dataset, a subset containing only known cancer driver genes was also constructed. Recursive Feature Elimination (RFE) was employed to identify top 20, 25, 30, 35, 40, 45, and 50 gene signatures that differentiate TNBC from the other BrCa subtypes. Five machine learning algorithms were applied to these selected features, and model performance evaluation showed that XGBoost performs best with a subset of 25 genes for the complete dataset and 20 genes for the driver dataset. Of these 45 genes from the two datasets, 34 were found to be differentially regulated. Kaplan-Meier (KM) analysis of Distant Metastasis Free Survival (DMFS) for these 34 differentially regulated genes revealed four genes that could be potential prognostic genes, two of which are novel (POU2AF1 and S100B). Finally, interactome and pathway enrichment analyses were carried out to investigate the functional role of the identified potential prognostic genes in TNBC. These genes are associated with MAPK, PI3K-Akt, Wnt, TGF-β, and other signal transduction pathways that are pivotal in the metastasis cascade. These gene signatures can provide novel molecular-level insights into metastasis.
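
The Kaplan-Meier step mentioned in the abstract can be illustrated with a minimal sketch of the product-limit estimator. This is not the authors' code, and the abstract does not say which tool they used. The function name `kaplan_meier` and the follow-up values below are hypothetical toy data chosen only to show the calculation, with an event taken to mean an observed distant metastasis.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up duration for each patient
    events -- 1 if distant metastasis was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    # Number of observed events and total exits (events + censored) per time.
    observed = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = observed.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Toy (made-up) DMFS data in months for two expression groups.
    high = kaplan_meier([5, 8, 12, 12, 20], [1, 1, 0, 1, 0])
    low = kaplan_meier([10, 15, 22, 30, 30], [0, 1, 0, 1, 0])
    print("high expression:", high)
    print("low expression: ", low)
```

In a real analysis the two curves would typically be compared with a log-rank test, which this sketch leaves out.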
format Online
Article
Text
id pubmed-9014315
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-9014315 2022-04-21 Comput Struct Biotechnol J Research Article
Research Network of Computational and Structural Biotechnology 2022-03-24 /pmc/articles/PMC9014315/ /pubmed/35465161 http://dx.doi.org/10.1016/j.csbj.2022.03.019 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
topic Research Article