Evaluation of machine learning algorithms for the prognosis of breast cancer from the Surveillance, Epidemiology, and End Results database

Bibliographic Details
Main authors: Wu, Ruiyang, Luo, Jing, Wan, Hangyu, Zhang, Haiyan, Yuan, Yewei, Hu, Huihua, Feng, Jinyan, Wen, Jing, Wang, Yan, Li, Junyan, Liang, Qi, Gan, Fengjiao, Zhang, Gang
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879508/
https://www.ncbi.nlm.nih.gov/pubmed/36701415
http://dx.doi.org/10.1371/journal.pone.0280340
collection PubMed
description INTRODUCTION: Many researchers have used machine learning (ML) to predict the prognosis of breast cancer (BC) patients and have reported that ML models show good individualized prediction performance. OBJECTIVE: This cohort study aimed to establish a reliable data analysis model by comparing the performance of 10 common ML algorithms with that of the traditional American Joint Committee on Cancer (AJCC) stage, and to use this model in a web application that provides individualized predictions for others. METHODS: This study included 63145 BC patients from the Surveillance, Epidemiology, and End Results database. RESULTS: Comparing the performance of the 10 ML algorithms and the 7th AJCC stage in the optimal test set, we found that, in terms of 5-year overall survival, multivariate adaptive regression splines (MARS) had the highest area under the curve (AUC) value (0.831) and F1-score (0.608), and both sensitivity (0.737) and specificity (0.772) were relatively high. In addition, MARS showed the highest AUC value (0.831, 95% confidence interval: 0.820–0.842) in comparison with the other ML algorithms and the 7th AJCC stage (all P < 0.05). MARS, the best-performing model, was selected for web application development (https://w12251393.shinyapps.io/app2/). CONCLUSIONS: This comparative study of multiple forecasting models on a large dataset found that the MARS-based model performed much better than the other ML algorithms and the 7th AJCC stage in individualized estimation of the survival of BC patients, which is very likely to be the next step towards precision medicine.
format Online
Article
Text
id pubmed-9879508
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9879508 2023-01-27 PLoS One Research Article
Public Library of Science 2023-01-26 /pmc/articles/PMC9879508/ /pubmed/36701415 http://dx.doi.org/10.1371/journal.pone.0280340 Text en © 2023 Wu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluation of machine learning algorithms for the prognosis of breast cancer from the Surveillance, Epidemiology, and End Results database
topic Research Article