
Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy

BACKGROUND: Breast cancer (BC) is the most common malignant tumor around the world. Timely detection of the tumor progression after treatment could improve the survival outcome of patients. This study aimed to develop machine learning models to predict events (defined as either (1) the first tumor r...


Bibliographic Details
Main Authors: Jin, Yudi; Lan, Ailin; Dai, Yuran; Jiang, Linshan; Liu, Shengchun
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10543332/
https://www.ncbi.nlm.nih.gov/pubmed/37777809
http://dx.doi.org/10.1186/s40001-023-01361-7
_version_ 1785114278857015296
author Jin, Yudi
Lan, Ailin
Dai, Yuran
Jiang, Linshan
Liu, Shengchun
author_sort Jin, Yudi
collection PubMed
description BACKGROUND: Breast cancer (BC) is the most common malignant tumor worldwide. Timely detection of tumor progression after treatment could improve patients' survival outcomes. This study aimed to develop machine learning models to predict events in BC patients post-treatment, with an event defined as (1) the first local, regional, or distant tumor relapse; (2) a diagnosis of a secondary malignant tumor; or (3) death from any cause. METHODS: Patients with a response of stable disease (SD) or progressive disease (PD) after neoadjuvant chemotherapy (NAC) were selected. Clinicopathological features and survival data were recorded at 1 year and 5 years, respectively. Patients were randomly divided into a training set and a test set at a ratio of 8:2. A random forest (RF) model and a logistic regression model were built for both the 1-year cohort and the 5-year cohort, and their performance was compared. The models were validated using data from the Surveillance, Epidemiology, and End Results (SEER) database. RESULTS: A total of 315 patients were included. In the 1-year cohort, 197 patients were assigned to the training set and 87 to the test set. The specificity, sensitivity, and AUC were 0.800, 0.833, and 0.810 for the RF model and 0.520, 0.833, and 0.653 for the logistic regression model. In the 5-year cohort, 132 patients were assigned to the training set and 33 to the test set. The specificity, sensitivity, and AUC were 0.882, 0.750, and 0.829 for the RF model and 0.882, 0.688, and 0.752 for the logistic regression model. In the external validation set, the specificity, sensitivity, and AUC were 0.765, 0.812, and 0.779 for the RF model and 0.833, 0.376, and 0.619 for the logistic regression model. CONCLUSION: The RF model performs well in predicting events among BC patients with SD or PD post-NAC. It may benefit BC patients by assisting in the detection of tumor recurrence. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40001-023-01361-7.
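The specificity and sensitivity figures in the description are ratios taken from a confusion matrix, and the 8:2 split fixes the size of each set. The sketch below is illustrative only: the confusion-matrix counts are hypothetical, chosen because they are arithmetically consistent with the 5-year RF values (0.882 and 0.750) on a 33-patient test set; the abstract itself does not report them. The helper names are likewise assumptions, not part of the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: share of patients with an event that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True-negative rate: share of event-free patients that the model clears."""
    return tn / (tn + fp)


def split_sizes(n: int, train_fraction: float = 0.8) -> tuple[int, int]:
    """Training and test set sizes for a given split ratio (8:2 by default)."""
    n_train = round(n * train_fraction)
    return n_train, n - n_train


# HYPOTHETICAL counts for a 33-patient test set (16 with an event, 17 without).
# They are not reported in the abstract; they only reproduce its rounded rates.
tp, fn, tn, fp = 12, 4, 15, 2

print(round(sensitivity(tp, fn), 3))  # 0.75
print(round(specificity(tn, fp), 3))  # 0.882
print(split_sizes(165))               # (132, 33), matching the 5-year cohort sizes
```

Other count combinations could give the same rounded rates, so this should be read as a plausibility check on the reported numbers rather than a reconstruction of the study's data.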
format Online
Article
Text
id pubmed-10543332
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10543332 2023-10-03 Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy Jin, Yudi Lan, Ailin Dai, Yuran Jiang, Linshan Liu, Shengchun Eur J Med Res Research BioMed Central 2023-09-30 /pmc/articles/PMC10543332/ /pubmed/37777809 http://dx.doi.org/10.1186/s40001-023-01361-7 Text en © The Author(s) 2023 Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy
topic Research