
Development of Gene Expression-Based Random Forest Model for Predicting Neoadjuvant Chemotherapy Response in Triple-Negative Breast Cancer

SIMPLE SUMMARY: Only 20–50% of patients with triple negative breast cancer achieve a pathological complete response from neoadjuvant chemotherapy, a strong indicator of patient survival. Therefore, there is an urgent need for a reliable predictive model of the patient’s pathological complete respons...


Bibliographic Details
Main Authors: Park, Seongyong; Yi, Gwansu
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8870575/
https://www.ncbi.nlm.nih.gov/pubmed/35205629
http://dx.doi.org/10.3390/cancers14040881
_version_ 1784656790569353216
author Park, Seongyong
Yi, Gwansu
author_facet Park, Seongyong
Yi, Gwansu
author_sort Park, Seongyong
collection PubMed
description SIMPLE SUMMARY: Only 20–50% of patients with triple-negative breast cancer achieve a pathological complete response to neoadjuvant chemotherapy, a strong indicator of patient survival. Therefore, there is an urgent need for a model that reliably predicts a patient's pathological complete response before treatment begins. The purpose of this study was to develop such a model based on random forest recursive feature elimination and to benchmark its performance against existing predictive models. Our study suggests that an 86-gene-based random forest model associated with DNA repair and cell cycle mechanisms can provide reliable predictions of neoadjuvant chemotherapy response in patients with triple-negative breast cancer. ABSTRACT: Neoadjuvant chemotherapy (NAC) response is an important indicator of patient survival in triple-negative breast cancer (TNBC), but predicting chemosensitivity remains a challenge in clinical practice. We developed an 86-gene-based random forest (RF) classifier capable of predicting neoadjuvant chemotherapy response (pathological Complete Response (pCR) or Residual Disease (RD)) in TNBC patients. The pCR classification performance of the proposed model was evaluated using Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves. The area under the ROC curve (AUROC) and the area under the PR curve (AUPRC) of the proposed model on the test set were 0.891 and 0.829, respectively. At a predefined specificity (>90%), the proposed model shows higher sensitivity than the best-performing previously reported NAC response prediction model (69.2% vs. 36.9%). Moreover, the pCR status predicted by the model explains the distant recurrence-free survival (DRFS) of TNBC patients well. In addition, the pCR probabilities that the proposed model produces from the expression profiles of the CCLE TNBC cell lines show a high Spearman rank correlation with cyclophosphamide sensitivity in those cell lines (SRCC [Formula: see text], p-value [Formula: see text]).
Associations between the 86 genes and DNA repair/cell cycle mechanisms were shown through functional enrichment analysis. Our study suggests that the random forest-based prediction model provides a reliable prediction of the clinical response to neoadjuvant chemotherapy and may explain chemosensitivity in TNBC.
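The abstract compares models by their sensitivity at a predefined specificity (>90%). As a minimal sketch of how such a figure can be computed from predicted pCR probabilities, the following pure-Python function scans candidate thresholds and keeps the one with the highest sensitivity whose specificity still exceeds the floor. The function name, the toy labels, and the toy scores are illustrative assumptions; they are not taken from the study, and this is not the authors' evaluation code.

```python
def sensitivity_at_specificity(y_true, scores, min_specificity=0.90):
    """Return (sensitivity, threshold) for the best threshold above a specificity floor.

    y_true: 1 = pCR (positive class), 0 = RD (negative class)
    scores: predicted pCR probabilities, one per sample
    A sample is called positive when its score >= threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Start from "call nobody positive": specificity 1.0, sensitivity 0.0.
    best = (0.0, float("inf"))
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
        specificity = 1 - fp / negatives
        sensitivity = tp / positives
        # Strict inequality mirrors the ">90%" wording in the abstract.
        if specificity > min_specificity and sensitivity > best[0]:
            best = (sensitivity, threshold)
    return best


# Toy data (hypothetical): four pCR samples followed by five RD samples.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1, 0.05]

sens, thr = sensitivity_at_specificity(toy_labels, toy_scores)
print(f"sensitivity={sens:.2f} at threshold={thr}")
```

Fixing the specificity first and then comparing sensitivities puts two classifiers on the same false-positive budget, which is why the abstract can report 69.2% vs. 36.9% as a like-for-like comparison.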
format Online
Article
Text
id pubmed-8870575
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8870575 2022-02-25 Development of Gene Expression-Based Random Forest Model for Predicting Neoadjuvant Chemotherapy Response in Triple-Negative Breast Cancer Park, Seongyong Yi, Gwansu Cancers (Basel) Article MDPI 2022-02-10 /pmc/articles/PMC8870575/ /pubmed/35205629 http://dx.doi.org/10.3390/cancers14040881 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Development of Gene Expression-Based Random Forest Model for Predicting Neoadjuvant Chemotherapy Response in Triple-Negative Breast Cancer
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8870575/
https://www.ncbi.nlm.nih.gov/pubmed/35205629
http://dx.doi.org/10.3390/cancers14040881