
The innovative model based on artificial intelligence algorithms to predict recurrence risk of patients with postoperative breast cancer

PURPOSE: This study aimed to develop a machine learning model to retrospectively study and predict the recurrence risk of breast cancer patients after surgery by extracting the clinicopathological features of tumors from unstructured clinical electronic health record (EHR) data. METHODS: This retros...


Bibliographic Details
Main authors: Zeng, Lixuan; Liu, Lei; Chen, Dongxin; Lu, Henghui; Xue, Yang; Bi, Hongjie; Yang, Weiwei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Oncology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10029918/
https://www.ncbi.nlm.nih.gov/pubmed/36959794
http://dx.doi.org/10.3389/fonc.2023.1117420
author Zeng, Lixuan
Liu, Lei
Chen, Dongxin
Lu, Henghui
Xue, Yang
Bi, Hongjie
Yang, Weiwei
collection PubMed
description PURPOSE: This study aimed to develop a machine learning model that predicts the recurrence risk of breast cancer patients after surgery by extracting the clinicopathological features of tumors from unstructured clinical electronic health record (EHR) data. METHODS: This retrospective cohort included 1,841 breast cancer patients who underwent surgical treatment. To extract the principal features associated with recurrence risk, the patients' clinical notes and histopathology reports were collected and feature engineering was applied. Predictive models were then constructed from these features. All algorithms were implemented in Python. The accuracy of the prediction models was verified in the test cohort. The area under the curve (AUC), precision, recall, and F1 score were used to evaluate the performance of each model. RESULTS: The training cohort comprised 1,289 patients and the test cohort 552 patients. A total of 1,841 textual reports from 2011 to 2019 were included. For the prediction of recurrence risk, LSTM, XGBoost, and SVM achieved accuracies of 0.89, 0.86, and 0.78, respectively. The AUC values of the micro-average ROC curves for LSTM, XGBoost, and SVM were 0.98 ± 0.01, 0.97 ± 0.03, and 0.92 ± 0.06, respectively. The LSTM model in particular outperformed the other models, with the highest accuracy, F1 score, macro-average F1 score (0.87), and weighted-average F1 score (0.89). All P values were statistically significant. Patients in the high-risk group predicted by the model were more resistant to DNA-damaging and microtubule-targeting drugs than those in the intermediate-risk group. Differences between the predicted low-risk patients and the intermediate- or high-risk patients were not statistically significant because of the small sample size (the model predicted 188 low-risk patients, only two of whom received chemotherapy alone after surgery). The prognosis of patients predicted by the model was consistent with the actual follow-up records. CONCLUSIONS: The constructed model accurately predicted the recurrence risk of breast cancer patients from EHR data and evaluated their chemoresistance and prognosis. The model can therefore help clinicians formulate individualized management of breast cancer patients.
format Online
Article
Text
id pubmed-10029918
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10029918 2023-03-22 Front Oncol Oncology Frontiers Media S.A. 2023-03-07 /pmc/articles/PMC10029918/ /pubmed/36959794 http://dx.doi.org/10.3389/fonc.2023.1117420 Text en Copyright © 2023 Zeng, Liu, Chen, Lu, Xue, Bi and Yang https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title The innovative model based on artificial intelligence algorithms to predict recurrence risk of patients with postoperative breast cancer
topic Oncology