Comparison of Basic and Ensemble Data Mining Methods in Predicting 5-Year Survival of Colorectal Cancer Patients
INTRODUCTION: Colorectal cancer (CRC) is one of the most common malignancies and causes of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year...
Main Authors: | Pourhoseingholi, Mohamad Amin; Kheirian, Sedigheh; Zali, Mohammad Reza |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AVICENA, d.o.o., Sarajevo 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5723205/ https://www.ncbi.nlm.nih.gov/pubmed/29284916 http://dx.doi.org/10.5455/aim.2017.25.254-258 |
_version_ | 1783285170934120448 |
---|---|
author | Pourhoseingholi, Mohamad Amin; Kheirian, Sedigheh; Zali, Mohammad Reza |
author_facet | Pourhoseingholi, Mohamad Amin; Kheirian, Sedigheh; Zali, Mohammad Reza |
author_sort | Pourhoseingholi, Mohamad Amin |
collection | PubMed |
description | INTRODUCTION: Colorectal cancer (CRC) is one of the most common malignancies and causes of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year survival of CRC patients using a variety of basic and ensemble data mining methods. METHODS: The CRC dataset from the Shahid Beheshti University of Medical Sciences Research Center for Gastroenterology and Liver Diseases was used for prediction and for a comparative study of basic and ensemble data mining techniques. Feature selection methods were used to select predictor attributes for classification. The WEKA toolkit was used to create the models and MedCalc software to compare them. RESULTS: The predictive performance of all developed models was high (all greater than 90%). Overall, ensemble models performed better than basic classifiers, and the best result was achieved by the ensemble voting model in terms of area under the ROC curve (AUC = 0.96). CONCLUSION: AUC comparison of the models showed that the ensemble voting method significantly outperformed all models except Random Forest (RF) and Bayesian Network (BN), whose 95% confidence intervals overlapped with its own. This result may indicate high predictive power of these two methods, along with ensemble voting, for predicting 5-year survival of CRC patients. |
format | Online Article Text |
id | pubmed-5723205 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | AVICENA, d.o.o., Sarajevo |
record_format | MEDLINE/PubMed |
spelling | pubmed-57232052017-12-28 Comparison of Basic and Ensemble Data Mining Methods in Predicting 5-Year Survival of Colorectal Cancer Patients Pourhoseingholi, Mohamad Amin Kheirian, Sedigheh Zali, Mohammad Reza Acta Inform Med Original Paper INTRODUCTION: Colorectal cancer (CRC) is one of the most common malignancies and cause of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year survival of CRC patients using variety of basic and ensemble data mining methods. METHODS: The CRC dataset from The Shahid Beheshti University of Medical Sciences Research Center for Gastroenterology and Liver Diseases were used for prediction and comparative study of the base and ensemble data mining techniques. Feature selection methods were used to select predictor attributes for classification. The WEKA toolkit and MedCalc software were respectively utilized for creating and comparing the models. RESULTS: The obtained results showed that the predictive performance of developed models was altogether high (all greater than 90%). Overall, the performance of ensemble models was higher than that of basic classifiers and the best result achieved by ensemble voting model in terms of area under the ROC curve (AUC= 0.96). CONCLUSION: AUC Comparison of models showed that the ensemble voting method significantly outperformed all models except for two methods of Random Forest (RF) and Bayesian Network (BN) considered the overlapping 95% confidence intervals. This result may indicate high predictive power of these two methods along with ensemble voting for predicting 5-year survival of CRC patients. 
AVICENA, d.o.o., Sarajevo 2017-12 /pmc/articles/PMC5723205/ /pubmed/29284916 http://dx.doi.org/10.5455/aim.2017.25.254-258 Text en Copyright: © 2017 Mohamad Amin Pourhoseingholi, Sedigheh Kheirian, Mohammad Reza Zali http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Pourhoseingholi, Mohamad Amin Kheirian, Sedigheh Zali, Mohammad Reza Comparison of Basic and Ensemble Data Mining Methods in Predicting 5-Year Survival of Colorectal Cancer Patients |
title | Comparison of Basic and Ensemble Data Mining Methods in Predicting 5-Year Survival of Colorectal Cancer Patients |
title_sort | comparison of basic and ensemble data mining methods in predicting 5-year survival of colorectal cancer patients |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5723205/ https://www.ncbi.nlm.nih.gov/pubmed/29284916 http://dx.doi.org/10.5455/aim.2017.25.254-258 |
work_keys_str_mv | AT pourhoseingholimohamadamin comparisonofbasicandensembledataminingmethodsinpredicting5yearsurvivalofcolorectalcancerpatients AT kheiriansedigheh comparisonofbasicandensembledataminingmethodsinpredicting5yearsurvivalofcolorectalcancerpatients AT zalimohammadreza comparisonofbasicandensembledataminingmethodsinpredicting5yearsurvivalofcolorectalcancerpatients |