Predicting breast cancer 5-year survival using machine learning: A systematic review

BACKGROUND: Accurately predicting the survival rate of breast cancer patients is a major issue for cancer researchers. Machine learning (ML) has attracted much attention with the hope that it could provide accurate results, but its modeling methods and prediction performance remain controversial. Th...

Bibliographic Details
Main authors: Li, Jiaxin; Zhou, Zijun; Dong, Jianyu; Fu, Ying; Li, Yuan; Luan, Ze; Peng, Xin
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8051758/
https://www.ncbi.nlm.nih.gov/pubmed/33861809
http://dx.doi.org/10.1371/journal.pone.0250370
author Li, Jiaxin
Zhou, Zijun
Dong, Jianyu
Fu, Ying
Li, Yuan
Luan, Ze
Peng, Xin
collection PubMed
description BACKGROUND: Accurately predicting the survival rate of breast cancer patients is a major issue for cancer researchers. Machine learning (ML) has attracted much attention in the hope that it could provide accurate results, but its modeling methods and prediction performance remain controversial. The aim of this systematic review is to identify and critically appraise current studies on the application of ML to predicting the 5-year survival rate of breast cancer. METHODS: In accordance with the PRISMA guidelines, two researchers independently searched the PubMed (including MEDLINE), Embase, and Web of Science Core databases from inception to November 30, 2020. The search terms included breast neoplasms, survival, machine learning, and specific algorithm names. Studies were included if they used ML to build a breast cancer survival prediction model and reported model performance that could be measured from the validation results. Studies were excluded if the modeling process was not clearly explained or the reported information was incomplete. The extracted information covered the literature, the database, data preparation and the modeling process, model construction and performance evaluation, and candidate predictors. RESULTS: Thirty-one studies met the inclusion criteria, most of which were published after 2013. The most frequently used ML methods were decision trees (19 studies, 61.3%), artificial neural networks (18 studies, 58.1%), support vector machines (16 studies, 51.6%), and ensemble learning (10 studies, 32.3%). The median sample size was 37,256 patients (range 200 to 659,820), and the median number of predictors was 16 (range 3 to 625). Accuracy (reported in 29 studies) ranged from 0.510 to 0.971. Sensitivity (25 studies) ranged from 0.037 to 1. Specificity (24 studies) ranged from 0.008 to 0.993. AUC (20 studies) ranged from 0.500 to 0.972. Precision (6 studies) ranged from 0.549 to 1. All of the models were internally validated, and only one was externally validated. CONCLUSIONS: Overall, ML models do not necessarily perform better than traditional statistical methods, and this area of research still faces limitations related to a lack of data preprocessing steps, excessive variation in sample feature selection, and issues with validation. Further optimization of the performance of the proposed models is also needed, which will require greater standardization and subsequent validation.
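The performance measures listed in the abstract (accuracy, sensitivity, specificity, precision) can all be derived from the four cells of a binary confusion matrix. The following is a minimal Python sketch of those standard definitions, not code from the reviewed studies; the class name `ConfusionMatrix` and all counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cell counts for a binary classifier (hypothetical helper, not from the review).

    Here "positive" stands for whichever outcome a study treats as the
    positive class, e.g. survival at 5 years.
    """
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    def accuracy(self) -> float:
        # Share of all patients classified correctly.
        return _ratio(self.tp + self.tn, self.tp + self.fp + self.tn + self.fn)

    def sensitivity(self) -> float:
        # Share of actual positives that the model detects (recall).
        return _ratio(self.tp, self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that the model detects.
        return _ratio(self.tn, self.tn + self.fp)

    def precision(self) -> float:
        # Share of predicted positives that are actually positive.
        return _ratio(self.tp, self.tp + self.fp)


# Illustrative counts only (assumed, not taken from any study).
balanced = ConfusionMatrix(tp=80, fp=10, tn=90, fn=20)

# A degenerate model that labels every patient positive on an
# imbalanced sample of 900 positives and 100 negatives.
always_positive = ConfusionMatrix(tp=900, fp=100, tn=0, fn=0)

if __name__ == "__main__":
    for name, cm in [("balanced", balanced), ("always_positive", always_positive)]:
        print(name, cm.accuracy(), cm.sensitivity(), cm.specificity(), cm.precision())
```

The second example shows how a model can reach high accuracy and a sensitivity of 1 while its specificity is 0, which may be one reason the reported sensitivity and specificity ranges are so wide; the abstract itself does not state the cause.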
format Online
Article
Text
id pubmed-8051758
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8051758 2021-04-28 Predicting breast cancer 5-year survival using machine learning: A systematic review Li, Jiaxin Zhou, Zijun Dong, Jianyu Fu, Ying Li, Yuan Luan, Ze Peng, Xin PLoS One Research Article Public Library of Science 2021-04-16 /pmc/articles/PMC8051758/ /pubmed/33861809 http://dx.doi.org/10.1371/journal.pone.0250370 Text en © 2021 Li et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Predicting breast cancer 5-year survival using machine learning: A systematic review
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8051758/
https://www.ncbi.nlm.nih.gov/pubmed/33861809
http://dx.doi.org/10.1371/journal.pone.0250370