Identification of a minimum number of genes to predict triple-negative breast cancer subgroups from gene expression profiles
BACKGROUND: Triple-negative breast cancer (TNBC) is a very heterogeneous disease. Several gene expression and mutation profiling approaches were used to classify it, and all converged to the identification of distinct molecular subtypes, with some overlapping across different approaches. However, a...
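Main authors: | Akhouayri, Laila; Ostano, Paola; Mello-Grand, Maurizia; Gregnanin, Ilaria; Crivelli, Francesca; Laurora, Sara; Liscia, Daniele; Leone, Francesco; Santoro, Angela; Mulè, Antonino; Guarino, Donatella; Maggiore, Claudia; Carlino, Angela; Magno, Stefano; Scatolini, Maria; Di Leone, Alba; Masetti, Riccardo; Chiorino, Giovanna |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9764480/ https://www.ncbi.nlm.nih.gov/pubmed/36536459 http://dx.doi.org/10.1186/s40246-022-00436-6 |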
_version_ | 1784853280918077440 |
---|---|
author | Akhouayri, Laila; Ostano, Paola; Mello-Grand, Maurizia; Gregnanin, Ilaria; Crivelli, Francesca; Laurora, Sara; Liscia, Daniele; Leone, Francesco; Santoro, Angela; Mulè, Antonino; Guarino, Donatella; Maggiore, Claudia; Carlino, Angela; Magno, Stefano; Scatolini, Maria; Di Leone, Alba; Masetti, Riccardo; Chiorino, Giovanna |
author_sort | Akhouayri, Laila |
collection | PubMed |
description | BACKGROUND: Triple-negative breast cancer (TNBC) is a very heterogeneous disease. Several gene expression and mutation profiling approaches were used to classify it, and all converged to the identification of distinct molecular subtypes, with some overlapping across different approaches. However, a standardised tool to routinely classify TNBC in the clinics and guide personalised treatment is lacking. We aimed at defining a specific gene signature for each of the six TNBC subtypes proposed by Lehman et al. in 2011 (basal-like 1 (BL1); basal-like 2 (BL2); mesenchymal (M); immunomodulatory (IM); mesenchymal stem-like (MSL); and luminal androgen receptor (LAR)), to be able to accurately predict them. METHODS: Lehman’s TNBCtype subtyping tool was applied to RNA-sequencing data from 482 TNBC (GSE164458), and a minimal subtype-specific gene signature was defined by combining two class comparison techniques with seven attribute selection methods. Several machine learning algorithms for subtype prediction were used, and the best classifier was applied on microarray data from 72 Italian TNBC and on the TNBC subset of the BRCA-TCGA data set. RESULTS: We identified two signatures with the 120 and 81 top up- and downregulated genes that define the six TNBC subtypes, with prediction accuracy ranging from 88.6 to 89.4%, and even improving after removal of the least important genes. Network analysis was used to identify highly interconnected genes within each subgroup. Two druggable matrix metalloproteinases were found in the BL1 and BL2 subsets, and several druggable targets were complementary to androgen receptor or aromatase in the LAR subset. Several secondary drug–target interactions were found among the upregulated genes in the M, IM and MSL subsets. CONCLUSIONS: Our study took full advantage of available TNBC data sets to stratify samples and genes into distinct subtypes, according to gene expression profiles. 
The development of a data mining approach to acquire a large amount of information from several data sets has allowed us to identify a well-determined minimal number of genes that may help in the recognition of TNBC subtypes. These genes, most of which have been previously found to be associated with breast cancer, have the potential to become novel diagnostic markers and/or therapeutic targets for specific TNBC subsets. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40246-022-00436-6. |
format | Online Article Text |
id | pubmed-9764480 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9764480 2022-12-21 Hum Genomics Research BioMed Central 2022-12-20 /pmc/articles/PMC9764480/ /pubmed/36536459 http://dx.doi.org/10.1186/s40246-022-00436-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Identification of a minimum number of genes to predict triple-negative breast cancer subgroups from gene expression profiles |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9764480/ https://www.ncbi.nlm.nih.gov/pubmed/36536459 http://dx.doi.org/10.1186/s40246-022-00436-6 |