Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning

Triple negative breast cancer (TNBC) lacks well-defined molecular targets and is highly heterogenous, making treatment challenging. Using gene expression analysis, TNBC has been classified into four different subtypes: basal-like immune-activated (BLIA), basal-like immune-suppressed (BLIS), mesenchy...

Bibliographic Details
Main authors: Bissanum, Rassanee, Chaichulee, Sitthichok, Kamolphiwong, Rawikant, Navakanitworakul, Raphatphorn, Kanokwiroon, Kanyanatt
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8472680/
https://www.ncbi.nlm.nih.gov/pubmed/34575658
http://dx.doi.org/10.3390/jpm11090881
_version_ 1784574796422447104
author Bissanum, Rassanee
Chaichulee, Sitthichok
Kamolphiwong, Rawikant
Navakanitworakul, Raphatphorn
Kanokwiroon, Kanyanatt
author_facet Bissanum, Rassanee
Chaichulee, Sitthichok
Kamolphiwong, Rawikant
Navakanitworakul, Raphatphorn
Kanokwiroon, Kanyanatt
author_sort Bissanum, Rassanee
collection PubMed
description Triple negative breast cancer (TNBC) lacks well-defined molecular targets and is highly heterogenous, making treatment challenging. Using gene expression analysis, TNBC has been classified into four different subtypes: basal-like immune-activated (BLIA), basal-like immune-suppressed (BLIS), mesenchymal (MES), and luminal androgen receptor (LAR). However, there is currently no standardized method for classifying TNBC subtypes. We attempted to define a gene signature for each subtype, and to develop a classification method based on machine learning (ML) for TNBC subtyping. In these experiments, gene expression microarray data for TNBC patients were downloaded from the Gene Expression Omnibus database. Differentially expressed genes unique to 198 known TNBC cases were identified and selected as a training gene set to train in seven different classification models. We produced a training set consisting of 719 DEGs selected from uniquely expressed genes of all four subtypes. The highest average accuracy of classification of the BLIA, BLIS, MES, and LAR subtypes was achieved by the SVM algorithm (accuracy 95–98.8%; AUC 0.99–1.00). For model validation, we used 334 samples of unknown TNBC subtypes, of which 97 (29.04%), 73 (21.86%), 39 (11.68%) and 59 (17.66%) were predicted to be BLIA, BLIS, MES, and LAR, respectively. However, 66 TNBC samples (19.76%) could not be assigned to any subtype. These samples contained only three upregulated genes (EN1, PROM1, and CCL2). Each TNBC subtype had a unique gene expression pattern, which was confirmed by identification of DEGs and pathway analysis. These results indicated that our training gene set was suitable for development of classification models, and that the SVM algorithm could classify TNBC into four unique subtypes. Accurate and consistent classification of the TNBC subtypes is essential for personalized treatment and prognosis of TNBC.
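As a quick consistency check on the validation figures quoted in the description, the short sketch below recomputes the share of each predicted subtype among the 334 validation samples. The counts are taken from the abstract; the dictionary layout and the helper name `percent_breakdown` are illustrative assumptions, not part of the authors' pipeline.

```python
# Predicted subtype counts for the validation samples, as quoted in the abstract.
# "unassigned" covers the samples that could not be assigned to any subtype.
counts = {"BLIA": 97, "BLIS": 73, "MES": 39, "LAR": 59, "unassigned": 66}


def percent_breakdown(counts):
    """Return each label's share of the total, in percent, rounded to 2 decimals.

    Hypothetical helper written for this check only.
    """
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


total = sum(counts.values())
shares = percent_breakdown(counts)

for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {total} ({pct:.2f}%)")
```

The five groups sum to 334, and the recomputed shares agree with the percentages given in the description.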
format Online
Article
Text
id pubmed-8472680
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-84726802021-09-28 Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning Bissanum, Rassanee Chaichulee, Sitthichok Kamolphiwong, Rawikant Navakanitworakul, Raphatphorn Kanokwiroon, Kanyanatt J Pers Med Article Triple negative breast cancer (TNBC) lacks well-defined molecular targets and is highly heterogenous, making treatment challenging. Using gene expression analysis, TNBC has been classified into four different subtypes: basal-like immune-activated (BLIA), basal-like immune-suppressed (BLIS), mesenchymal (MES), and luminal androgen receptor (LAR). However, there is currently no standardized method for classifying TNBC subtypes. We attempted to define a gene signature for each subtype, and to develop a classification method based on machine learning (ML) for TNBC subtyping. In these experiments, gene expression microarray data for TNBC patients were downloaded from the Gene Expression Omnibus database. Differentially expressed genes unique to 198 known TNBC cases were identified and selected as a training gene set to train in seven different classification models. We produced a training set consisting of 719 DEGs selected from uniquely expressed genes of all four subtypes. The highest average accuracy of classification of the BLIA, BLIS, MES, and LAR subtypes was achieved by the SVM algorithm (accuracy 95–98.8%; AUC 0.99–1.00). For model validation, we used 334 samples of unknown TNBC subtypes, of which 97 (29.04%), 73 (21.86%), 39 (11.68%) and 59 (17.66%) were predicted to be BLIA, BLIS, MES, and LAR, respectively. However, 66 TNBC samples (19.76%) could not be assigned to any subtype. These samples contained only three upregulated genes (EN1, PROM1, and CCL2). Each TNBC subtype had a unique gene expression pattern, which was confirmed by identification of DEGs and pathway analysis. 
These results indicated that our training gene set was suitable for development of classification models, and that the SVM algorithm could classify TNBC into four unique subtypes. Accurate and consistent classification of the TNBC subtypes is essential for personalized treatment and prognosis of TNBC. MDPI 2021-09-01 /pmc/articles/PMC8472680/ /pubmed/34575658 http://dx.doi.org/10.3390/jpm11090881 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Bissanum, Rassanee
Chaichulee, Sitthichok
Kamolphiwong, Rawikant
Navakanitworakul, Raphatphorn
Kanokwiroon, Kanyanatt
Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning
title Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning
title_full Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning
title_fullStr Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning
title_full_unstemmed Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning
title_short Molecular Classification Models for Triple Negative Breast Cancer Subtype Using Machine Learning
title_sort molecular classification models for triple negative breast cancer subtype using machine learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8472680/
https://www.ncbi.nlm.nih.gov/pubmed/34575658
http://dx.doi.org/10.3390/jpm11090881
work_keys_str_mv AT bissanumrassanee molecularclassificationmodelsfortriplenegativebreastcancersubtypeusingmachinelearning
AT chaichuleesitthichok molecularclassificationmodelsfortriplenegativebreastcancersubtypeusingmachinelearning
AT kamolphiwongrawikant molecularclassificationmodelsfortriplenegativebreastcancersubtypeusingmachinelearning
AT navakanitworakulraphatphorn molecularclassificationmodelsfortriplenegativebreastcancersubtypeusingmachinelearning
AT kanokwiroonkanyanatt molecularclassificationmodelsfortriplenegativebreastcancersubtypeusingmachinelearning