A Machine Learning Model to Predict the Triple Negative Breast Cancer Immune Subtype

BACKGROUND: Immune checkpoint blockade (ICB) has been approved for the treatment of triple-negative breast cancer (TNBC), since it significantly improved the progression-free survival (PFS). However, only about 10% of TNBC patients could achieve the complete response (CR) to ICB because of the low r...

Bibliographic Details
Main authors: Chen, Zihao; Wang, Maoli; De Wilde, Rudy Leon; Feng, Ruifa; Su, Mingqiang; Torres-de la Roche, Luz Angela; Shi, Wenjie
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Immunology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8484710/
https://www.ncbi.nlm.nih.gov/pubmed/34603338
http://dx.doi.org/10.3389/fimmu.2021.749459
author Chen, Zihao
Wang, Maoli
De Wilde, Rudy Leon
Feng, Ruifa
Su, Mingqiang
Torres-de la Roche, Luz Angela
Shi, Wenjie
collection PubMed
description BACKGROUND: Immune checkpoint blockade (ICB) has been approved for the treatment of triple-negative breast cancer (TNBC) because it significantly improves progression-free survival (PFS). However, only about 10% of TNBC patients achieve a complete response (CR) to ICB, owing to its low response rate and potential adverse reactions. METHODS: Open datasets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) were downloaded, and unsupervised clustering of the expression profiles was used to identify immune subtypes. Prognosis, enriched pathways, and ICB indicators were compared between the immune subtypes. Samples from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset were then used to validate the correlation of immune subtype with prognosis, and data from patients who received ICB were used to validate its correlation with ICB response. Machine learning models were used to build a visual web server that predicts the immune subtype of TNBC patients requiring ICB. RESULTS: Eight open datasets comprising 931 TNBC samples were used for the unsupervised clustering. Two novel immune subtypes (referred to as S1 and S2) were identified among TNBC patients. Compared with S2, S1 was associated with higher immune scores, higher levels of immune cells, and a better prognosis for immunotherapy. In the validation dataset, S1 samples had a better prognosis than S2 samples in both overall survival (OS; p = 0.00036) and relapse-free survival (RFS; p = 0.0022). Bioinformatics analysis identified 11 hub genes (LCK, IL2RG, CD3G, STAT1, CD247, IL2RB, CD3D, IRF1, OAS2, IRF4, and IFNG) related to the immune subtype.
A random forest model built on the 11 hub genes performed reasonably well, with an area under the receiver operating characteristic curve (AUC) of 0.76. An open and free web server based on the random forest model, named triple-negative breast cancer immune subtype (TNBCIS), was developed and is available at https://immunotypes.shinyapps.io/TNBCIS/. CONCLUSION: TNBC open datasets allowed us to stratify samples into distinct immunotherapy response subgroups according to gene expression profiles. Based on the two novel subtypes, candidates for ICB with a higher response rate and better prognosis could be selected using the free visual online web server that we designed.
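The AUC quoted in the description can be read as the probability that a randomly chosen sample of one subtype receives a higher predicted score than a randomly chosen sample of the other. The sketch below illustrates that pairwise definition in plain Python. The labels and scores are hypothetical illustration values, not data from the study, and treating S1 as the positive class is an assumption; the record does not say how the authors computed their AUC.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive sample has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = subtype S1 (assumed positive class), 0 = subtype S2.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical predicted probabilities of S1, e.g. from a classifier.
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect separation of the two subtypes, which is the context for a reported value of 0.76.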
format Online
Article
Text
id pubmed-8484710
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8484710 2021-10-02 Front Immunol Immunology Frontiers Media S.A. 2021-09-17 /pmc/articles/PMC8484710/ /pubmed/34603338 http://dx.doi.org/10.3389/fimmu.2021.749459 Text en Copyright © 2021 Chen, Wang, De Wilde, Feng, Su, Torres-de la Roche and Shi https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Machine Learning Model to Predict the Triple Negative Breast Cancer Immune Subtype
topic Immunology