Developing the breast cancer risk prediction system using hybrid machine learning algorithms

BACKGROUND: Breast cancer (BC) is the most common cause of cancer-related deaths in women globally. Currently, many machine learning (ML)-based predictive models have been established to assist clinicians in decision making for the prediction of BC. However, preventing risk factor formation even wit...


Bibliographic Details
Main authors: Afrash, Mohammad R., Bayani, Azadeh, Shanbehzadeh, Mostafa, Bahadori, Mohammadkarim, Kazemi-Arpanahi, Hadi
Format: Online Article Text
Language: English
Published: Wolters Kluwer - Medknow 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9621357/
https://www.ncbi.nlm.nih.gov/pubmed/36325225
http://dx.doi.org/10.4103/jehp.jehp_42_22
author Afrash, Mohammad R.
Bayani, Azadeh
Shanbehzadeh, Mostafa
Bahadori, Mohammadkarim
Kazemi-Arpanahi, Hadi
collection PubMed
description BACKGROUND: Breast cancer (BC) is the most common cause of cancer-related death in women globally. Many machine learning (ML)-based predictive models have been developed to assist clinicians in decision making for BC prediction. However, preventing risk factors from forming through healthy lifestyle behaviors, or preventing disease at early stages, can contribute substantially to population-wide BC health. We therefore aimed to develop a model for the prediction and early warning of BC that combines a genetic algorithm (GA) with several ML algorithms. MATERIAL AND METHODS: Data from 3168 healthy individuals and 1742 patient case records in the BC Registry Database of Ayatollah Taleghani hospital, Abadan, Iran were analyzed. First, a modified hybrid GA was used to select features and optimize the selected feature set. Then, several ML algorithms were trained on the selected features to predict BC. The performance of each model was measured in terms of accuracy, precision, sensitivity, specificity, and the receiver operating characteristic (ROC) curve. Finally, a clinical decision support system based on the best model was developed. RESULTS: Feature selection identified age, consumption of dairy products, BC family history, breast biopsy, chest X-ray, hormone therapy, alcohol consumption, being overweight, having children, and education status as the most important features for predicting BC. The decision tree outperformed the other ML models, with an accuracy of 99.3%, a specificity of 99.5%, and a sensitivity of 98.26%. CONCLUSION: The developed predictive system can accurately identify persons at elevated risk for BC and can serve both as a clinical screening tool for the early prevention of BC and as a tool for developing preventive health strategies.
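The abstract above describes a wrapper approach in which a GA searches over feature subsets and a classifier scores each subset. The following is a minimal sketch of that general idea, not the authors' modified hybrid GA. Everything in it is an assumption made for illustration: the data are synthetic, the helper names (`make_data`, `centroid_accuracy`, `ga_select`) are hypothetical, and a nearest-centroid classifier stands in for the decision tree and other ML models mentioned in the record.

```python
import random

def make_data(n, n_features, informative, rng):
    """Synthetic binary data (assumption): only `informative` columns carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features for the positive class
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(mask, train, valid):
    """Fitness: validation accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    (Xtr, ytr), (Xva, yva) = train, valid
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(Xtr, ytr) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, lab in zip(Xva, yva):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])) for c in (0, 1)}
        correct += min(dist, key=dist.get) == lab
    return correct / len(yva)

def ga_select(train, valid, n_features, rng, pop_size=20, generations=15, p_mut=0.1):
    """Plain GA over bit masks: elitism, tournament selection, one-point crossover, bit-flip mutation."""
    def fitness(mask):
        return centroid_accuracy(mask, train, valid)

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[:2]  # carry the two best masks over unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Each parent is the winner of a 3-way tournament.
            a, b = (max(rng.sample(pop, 3), key=fitness) for _ in range(2))
            cut = rng.randint(1, n_features - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

rng = random.Random(0)
N_FEATURES, INFORMATIVE = 12, [0, 3, 7]  # hypothetical sizes, not from the study
X, y = make_data(300, N_FEATURES, INFORMATIVE, rng)
train, valid = (X[:200], y[:200]), (X[200:], y[200:])
best_mask, history = ga_select(train, valid, N_FEATURES, rng)
selected = [j for j, bit in enumerate(best_mask) if bit]
print("selected feature indices:", selected)
print("validation accuracy of best subset:", history[-1])
```

Because the two best masks are carried over unchanged, the best fitness never decreases from one generation to the next. In a real study the subset chosen this way would still need to be evaluated on data not used during the search, since the validation split here guides the selection itself.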
format Online
Article
Text
id pubmed-9621357
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Wolters Kluwer - Medknow
record_format MEDLINE/PubMed
spelling pubmed-9621357 2022-11-01 Developing the breast cancer risk prediction system using hybrid machine learning algorithms Afrash, Mohammad R. Bayani, Azadeh Shanbehzadeh, Mostafa Bahadori, Mohammadkarim Kazemi-Arpanahi, Hadi J Educ Health Promot Original Article Wolters Kluwer - Medknow 2022-08-25 /pmc/articles/PMC9621357/ /pubmed/36325225 http://dx.doi.org/10.4103/jehp.jehp_42_22 Text en Copyright: © 2022 Journal of Education and Health Promotion https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
title Developing the breast cancer risk prediction system using hybrid machine learning algorithms
topic Original Article