Cargando…

Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study

BACKGROUND: Gastric cancer is the most common malignant tumor worldwide and a leading cause of cancer deaths. This neoplasm has a poor prognosis and heterogeneous outcomes. Survivability prediction may help select the best treatment plan based on an individual’s prognosis. Numerous clinical and path...

Descripción completa

Detalles Bibliográficos
Autores principales: Afrash, Mohammad Reza, Mirbagheri, Esmat, Mashoufi, Mehrnaz, Kazemi-Arpanahi, Hadi
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10080884/
https://www.ncbi.nlm.nih.gov/pubmed/37024885
http://dx.doi.org/10.1186/s12911-023-02154-y
_version_ 1785021008347922432
author Afrash, Mohammad Reza
Mirbagheri, Esmat
Mashoufi, Mehrnaz
Kazemi-Arpanahi, Hadi
author_facet Afrash, Mohammad Reza
Mirbagheri, Esmat
Mashoufi, Mehrnaz
Kazemi-Arpanahi, Hadi
author_sort Afrash, Mohammad Reza
collection PubMed
description BACKGROUND: Gastric cancer is the most common malignant tumor worldwide and a leading cause of cancer deaths. This neoplasm has a poor prognosis and heterogeneous outcomes. Survivability prediction may help select the best treatment plan based on an individual’s prognosis. Numerous clinical and pathological features are generally used in predicting gastric cancer survival, and their influence on the survival of this cancer has not been fully elucidated. Moreover, the five-year survivability prognosis performances of feature selection methods with machine learning (ML) classifiers for gastric cancer have not been fully benchmarked. Therefore, we adopted several well-known feature selection methods and ML classifiers together to determine the best-paired feature selection-classifier for this purpose. METHODS: This was a retrospective study on a dataset of 974 patients diagnosed with gastric cancer in the Ayatollah Talleghani Hospital, Abadan, Iran. First, four feature selection algorithms, including Relief, Boruta, least absolute shrinkage and selection operator (LASSO), and minimum redundancy maximum relevance (mRMR) were used to select a set of relevant features that are very informative for five-year survival prediction in gastric cancer patients. Then, each feature set was fed to three classifiers: XG Boost (XGB), hist gradient boosting (HGB), and support vector machine (SVM) to develop predictive models. Finally, paired feature selection-classifier methods were evaluated to select the best-paired method using the area under the curve (AUC), accuracy, sensitivity, specificity, and f1-score metrics. RESULTS: The LASSO feature selection algorithm combined with the XG Boost classifier achieved an accuracy of 89.10%, a specificity of 87.15%, a sensitivity of 89.42%, an AUC of 89.37%, and an f1-score of 90.8%. Tumor stage, history of other cancers, lymphatic invasion, tumor site, type of treatment, body weight, histological type, and addiction were identified as the most significant factors affecting gastric cancer survival. CONCLUSIONS: This study proved the worth of the paired feature selection-classifier to identify the best path that could augment the five-year survival prediction in gastric cancer patients. Our results were better than those of previous studies, both in terms of the time required to form the models and the performance measurement criteria of the algorithms. These findings may be very promising and can, therefore, inform clinical decision-making and shed light on future studies.
format Online
Article
Text
id pubmed-10080884
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-100808842023-04-08 Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study Afrash, Mohammad Reza Mirbagheri, Esmat Mashoufi, Mehrnaz Kazemi-Arpanahi, Hadi BMC Med Inform Decis Mak Research BACKGROUND: Gastric cancer is the most common malignant tumor worldwide and a leading cause of cancer deaths. This neoplasm has a poor prognosis and heterogeneous outcomes. Survivability prediction may help select the best treatment plan based on an individual’s prognosis. Numerous clinical and pathological features are generally used in predicting gastric cancer survival, and their influence on the survival of this cancer has not been fully elucidated. Moreover, the five-year survivability prognosis performances of feature selection methods with machine learning (ML) classifiers for gastric cancer have not been fully benchmarked. Therefore, we adopted several well-known feature selection methods and ML classifiers together to determine the best-paired feature selection-classifier for this purpose. METHODS: This was a retrospective study on a dataset of 974 patients diagnosed with gastric cancer in the Ayatollah Talleghani Hospital, Abadan, Iran. First, four feature selection algorithms, including Relief, Boruta, least absolute shrinkage and selection operator (LASSO), and minimum redundancy maximum relevance (mRMR) were used to select a set of relevant features that are very informative for five-year survival prediction in gastric cancer patients. Then, each feature set was fed to three classifiers: XG Boost (XGB), hist gradient boosting (HGB), and support vector machine (SVM) to develop predictive models. Finally, paired feature selection-classifier methods were evaluated to select the best-paired method using the area under the curve (AUC), accuracy, sensitivity, specificity, and f1-score metrics. RESULTS: The LASSO feature selection algorithm combined with the XG Boost classifier achieved an accuracy of 89.10%, a specificity of 87.15%, a sensitivity of 89.42%, an AUC of 89.37%, and an f1-score of 90.8%. Tumor stage, history of other cancers, lymphatic invasion, tumor site, type of treatment, body weight, histological type, and addiction were identified as the most significant factors affecting gastric cancer survival. CONCLUSIONS: This study proved the worth of the paired feature selection-classifier to identify the best path that could augment the five-year survival prediction in gastric cancer patients. Our results were better than those of previous studies, both in terms of the time required to form the models and the performance measurement criteria of the algorithms. These findings may be very promising and can, therefore, inform clinical decision-making and shed light on future studies. BioMed Central 2023-04-06 /pmc/articles/PMC10080884/ /pubmed/37024885 http://dx.doi.org/10.1186/s12911-023-02154-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Afrash, Mohammad Reza
Mirbagheri, Esmat
Mashoufi, Mehrnaz
Kazemi-Arpanahi, Hadi
Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
title Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
title_full Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
title_fullStr Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
title_full_unstemmed Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
title_short Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
title_sort optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10080884/
https://www.ncbi.nlm.nih.gov/pubmed/37024885
http://dx.doi.org/10.1186/s12911-023-02154-y
work_keys_str_mv AT afrashmohammadreza optimizingprognosticfactorsoffiveyearsurvivalingastriccancerpatientsusingfeatureselectiontechniqueswithmachinelearningalgorithmsacomparativestudy
AT mirbagheriesmat optimizingprognosticfactorsoffiveyearsurvivalingastriccancerpatientsusingfeatureselectiontechniqueswithmachinelearningalgorithmsacomparativestudy
AT mashoufimehrnaz optimizingprognosticfactorsoffiveyearsurvivalingastriccancerpatientsusingfeatureselectiontechniqueswithmachinelearningalgorithmsacomparativestudy
AT kazemiarpanahihadi optimizingprognosticfactorsoffiveyearsurvivalingastriccancerpatientsusingfeatureselectiontechniqueswithmachinelearningalgorithmsacomparativestudy