
Comparison of Basic and Ensemble Data Mining Methods in Predicting 5-Year Survival of Colorectal Cancer Patients

INTRODUCTION: Colorectal cancer (CRC) is one of the most common malignancies and causes of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year...


Bibliographic Details
Main Authors: Pourhoseingholi, Mohamad Amin, Kheirian, Sedigheh, Zali, Mohammad Reza
Format: Online Article Text
Language: English
Published: AVICENA, d.o.o., Sarajevo 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5723205/
https://www.ncbi.nlm.nih.gov/pubmed/29284916
http://dx.doi.org/10.5455/aim.2017.25.254-258
author Pourhoseingholi, Mohamad Amin
Kheirian, Sedigheh
Zali, Mohammad Reza
collection PubMed
description INTRODUCTION: Colorectal cancer (CRC) is one of the most common malignancies and causes of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year survival of CRC patients using a variety of basic and ensemble data mining methods. METHODS: The CRC dataset from the Shahid Beheshti University of Medical Sciences Research Center for Gastroenterology and Liver Diseases was used to build and compare basic and ensemble data mining models. Feature selection methods were used to select predictor attributes for classification. The WEKA toolkit was used to create the models and MedCalc software to compare them. RESULTS: The predictive performance of the developed models was high overall (all greater than 90%). The ensemble models generally performed better than the basic classifiers, and the best result was achieved by the ensemble voting model in terms of area under the ROC curve (AUC = 0.96). CONCLUSION: A comparison of AUCs showed that the ensemble voting method significantly outperformed all other models except Random Forest (RF) and Bayesian Network (BN), whose 95% confidence intervals overlapped with that of the voting model. This result may indicate that these two methods, along with ensemble voting, have high predictive power for 5-year survival of CRC patients.
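The voting-and-AUC comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' WEKA or MedCalc workflow: the function names `soft_vote` and `auc`, the choice of averaging predicted probabilities as the voting rule, and the six-patient toy data are all assumptions made for illustration only.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-patient survival probabilities of several base models.

    Averaging is one common voting rule; it is an assumption here, not
    necessarily the rule used in the study.
    """
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = survived 5 years, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]  # predicted survival probabilities
model_b = [0.6, 0.8, 0.3, 0.4, 0.7, 0.2]

ensemble = soft_vote([model_a, model_b])

print("AUC model A :", round(auc(labels, model_a), 3))
print("AUC model B :", round(auc(labels, model_b), 3))
print("AUC ensemble:", round(auc(labels, ensemble), 3))
```

On this toy data the two base models make different mistakes, so the averaged scores rank every survivor above every non-survivor. A real comparison would also need confidence intervals for each AUC, as the abstract notes, before calling one model significantly better than another.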
format Online
Article
Text
id pubmed-5723205
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher AVICENA, d.o.o., Sarajevo
record_format MEDLINE/PubMed
spelling pubmed-5723205 2017-12-28 Acta Inform Med Original Paper
AVICENA, d.o.o., Sarajevo 2017-12 /pmc/articles/PMC5723205/ /pubmed/29284916 http://dx.doi.org/10.5455/aim.2017.25.254-258 Text en Copyright: © 2017 Mohamad Amin Pourhoseingholi, Sedigheh Kheirian, Mohammad Reza Zali http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Paper