Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation
BACKGROUND: The low breast cancer survival rates in less developed countries are critical. The machine learning techniques predict cancers survival with high accuracy. Missing data are the most important limitation for using the highest potential of these techniques to predict cancers survival. Mult...
Main Authors: | LOTFNEZHAD AFSHAR, Hadi; JABBARI, Nasrollah; KHALKHALI, Hamid Reza; ESNAASHARI, Omid |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Tehran University of Medical Sciences 2021 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8214598/ https://www.ncbi.nlm.nih.gov/pubmed/34178808 http://dx.doi.org/10.18502/ijph.v50i3.5606 |
_version_ | 1783710098290376704 |
---|---|
author | LOTFNEZHAD AFSHAR, Hadi; JABBARI, Nasrollah; KHALKHALI, Hamid Reza; ESNAASHARI, Omid |
author_facet | LOTFNEZHAD AFSHAR, Hadi; JABBARI, Nasrollah; KHALKHALI, Hamid Reza; ESNAASHARI, Omid |
author_sort | LOTFNEZHAD AFSHAR, Hadi |
collection | PubMed |
description | BACKGROUND: The low breast cancer survival rates in less developed countries are critical. The machine learning techniques predict cancers survival with high accuracy. Missing data are the most important limitation for using the highest potential of these techniques to predict cancers survival. Multiple imputation (MI) was implemented and analyzed in detail to impute the missing data of a breast cancer dataset. METHODS: The dataset was from The Omid Treatment and Research Center Urmia, Iran between Jan 2006 and Dec 2012 and had information from 856 women. The algorithms such as C5 and repeated incremental pruning to produce error reduction were applied on the imputed versions of the original dataset and the non-imputed dataset to predict and extract clinical rules, respectively. RESULTS: The findings showed the performance of C5 in all the evaluation criteria including accuracy (84.42%), sensitivity (92.21%), specificity (64%), Kappa statistic (59.06%), and the area under the receiver operator characteristic (ROC) curve (0.84), was improved after imputation. CONCLUSION: The dataset of the present study met the requirements for using the multiple imputation method. The extracted rules after the application of MI were more comprehensive and contained knowledge that is more clinical. However, the clinical value of the extracted rules after filling in the missing data did not noticeably increase. |
format | Online Article Text |
id | pubmed-8214598 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Tehran University of Medical Sciences |
record_format | MEDLINE/PubMed |
spelling | pubmed-8214598 2021-06-25 Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation LOTFNEZHAD AFSHAR, Hadi JABBARI, Nasrollah KHALKHALI, Hamid Reza ESNAASHARI, Omid Iran J Public Health Original Article BACKGROUND: The low breast cancer survival rates in less developed countries are critical. The machine learning techniques predict cancers survival with high accuracy. Missing data are the most important limitation for using the highest potential of these techniques to predict cancers survival. Multiple imputation (MI) was implemented and analyzed in detail to impute the missing data of a breast cancer dataset. METHODS: The dataset was from The Omid Treatment and Research Center Urmia, Iran between Jan 2006 and Dec 2012 and had information from 856 women. The algorithms such as C5 and repeated incremental pruning to produce error reduction were applied on the imputed versions of the original dataset and the non-imputed dataset to predict and extract clinical rules, respectively. RESULTS: The findings showed the performance of C5 in all the evaluation criteria including accuracy (84.42%), sensitivity (92.21%), specificity (64%), Kappa statistic (59.06%), and the area under the receiver operator characteristic (ROC) curve (0.84), was improved after imputation. CONCLUSION: The dataset of the present study met the requirements for using the multiple imputation method. The extracted rules after the application of MI were more comprehensive and contained knowledge that is more clinical. However, the clinical value of the extracted rules after filling in the missing data did not noticeably increase. Tehran University of Medical Sciences 2021-03 /pmc/articles/PMC8214598/ /pubmed/34178808 http://dx.doi.org/10.18502/ijph.v50i3.5606 Text en Copyright © 2021 Lotfnezhad Afshar et al. Published by Tehran University of Medical Sciences https://creativecommons.org/licenses/by-nc/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (https://creativecommons.org/licenses/by-nc/4.0/). Non-commercial uses of the work are permitted, provided the original work is properly cited. |
spellingShingle | Original Article LOTFNEZHAD AFSHAR, Hadi JABBARI, Nasrollah KHALKHALI, Hamid Reza ESNAASHARI, Omid Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation |
title | Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation |
title_full | Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation |
title_fullStr | Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation |
title_full_unstemmed | Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation |
title_short | Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation |
title_sort | prediction of breast cancer survival by machine learning methods: an application of multiple imputation |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8214598/ https://www.ncbi.nlm.nih.gov/pubmed/34178808 http://dx.doi.org/10.18502/ijph.v50i3.5606 |
work_keys_str_mv | AT lotfnezhadafsharhadi predictionofbreastcancersurvivalbymachinelearningmethodsanapplicationofmultipleimputation AT jabbarinasrollah predictionofbreastcancersurvivalbymachinelearningmethodsanapplicationofmultipleimputation AT khalkhalihamidreza predictionofbreastcancersurvivalbymachinelearningmethodsanapplicationofmultipleimputation AT esnaashariomid predictionofbreastcancersurvivalbymachinelearningmethodsanapplicationofmultipleimputation |