Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System

OBJECTIVE: Over the past decades, many studies have used data mining technology to predict the 5-year survival rate of colorectal cancer, but there have been few reports that compared multiple data mining algorithms to the TNM classification of malignant tumors (TNM) staging system using a dataset i...

Bibliographic Details
Main authors: Gao, Peng, Zhou, Xin, Wang, Zhen-ning, Song, Yong-xi, Tong, Lin-lin, Xu, Ying-ying, Yue, Zhen-yu, Xu, Hui-mian
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3404978/
https://www.ncbi.nlm.nih.gov/pubmed/22848691
http://dx.doi.org/10.1371/journal.pone.0042015
_version_ 1782239055634759680
author Gao, Peng
Zhou, Xin
Wang, Zhen-ning
Song, Yong-xi
Tong, Lin-lin
Xu, Ying-ying
Yue, Zhen-yu
Xu, Hui-mian
author_facet Gao, Peng
Zhou, Xin
Wang, Zhen-ning
Song, Yong-xi
Tong, Lin-lin
Xu, Ying-ying
Yue, Zhen-yu
Xu, Hui-mian
author_sort Gao, Peng
collection PubMed
description OBJECTIVE: Over the past decades, many studies have used data mining technology to predict the 5-year survival rate of colorectal cancer, but there have been few reports that compared multiple data mining algorithms to the TNM classification of malignant tumors (TNM) staging system using a dataset in which the training and testing data were from different sources. Here we compared nine data mining algorithms to the TNM staging system for colorectal survival analysis. METHODS: Two different datasets were used: 1) the National Cancer Institute's Surveillance, Epidemiology, and End Results dataset; and 2) the dataset from a single Chinese institution. An optimization and prediction system based on nine data mining algorithms as well as two variable selection methods was implemented. The TNM staging system was based on the 7th edition of the American Joint Committee on Cancer TNM staging system. RESULTS: When the training and testing data were from the same sources, all algorithms had slight advantages over the TNM staging system in predictive accuracy. When the data were from different sources, only four algorithms (logistic regression, general regression neural network, Bayesian networks, and Naïve Bayes) had slight advantages over the TNM staging system. Also, there were no significant differences among the algorithms (p>0.05). CONCLUSIONS: The TNM staging system is simple and practical at present, and data mining methods are not accurate enough to replace the TNM staging system for colorectal cancer survival prediction. Furthermore, there were no significant differences in the predictive accuracy of the algorithms when the data were from different sources. Building a larger dataset that includes more variables may be important for furthering predictive accuracy.
format Online
Article
Text
id pubmed-3404978
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3404978 2012-07-30 Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System Gao, Peng Zhou, Xin Wang, Zhen-ning Song, Yong-xi Tong, Lin-lin Xu, Ying-ying Yue, Zhen-yu Xu, Hui-mian PLoS One Research Article OBJECTIVE: Over the past decades, many studies have used data mining technology to predict the 5-year survival rate of colorectal cancer, but there have been few reports that compared multiple data mining algorithms to the TNM classification of malignant tumors (TNM) staging system using a dataset in which the training and testing data were from different sources. Here we compared nine data mining algorithms to the TNM staging system for colorectal survival analysis. METHODS: Two different datasets were used: 1) the National Cancer Institute's Surveillance, Epidemiology, and End Results dataset; and 2) the dataset from a single Chinese institution. An optimization and prediction system based on nine data mining algorithms as well as two variable selection methods was implemented. The TNM staging system was based on the 7th edition of the American Joint Committee on Cancer TNM staging system. RESULTS: When the training and testing data were from the same sources, all algorithms had slight advantages over the TNM staging system in predictive accuracy. When the data were from different sources, only four algorithms (logistic regression, general regression neural network, Bayesian networks, and Naïve Bayes) had slight advantages over the TNM staging system. Also, there were no significant differences among the algorithms (p>0.05). CONCLUSIONS: The TNM staging system is simple and practical at present, and data mining methods are not accurate enough to replace the TNM staging system for colorectal cancer survival prediction. Furthermore, there were no significant differences in the predictive accuracy of the algorithms when the data were from different sources. Building a larger dataset that includes more variables may be important for furthering predictive accuracy. Public Library of Science 2012-07-25 /pmc/articles/PMC3404978/ /pubmed/22848691 http://dx.doi.org/10.1371/journal.pone.0042015 Text en Gao et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Gao, Peng
Zhou, Xin
Wang, Zhen-ning
Song, Yong-xi
Tong, Lin-lin
Xu, Ying-ying
Yue, Zhen-yu
Xu, Hui-mian
Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System
title Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System
title_full Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System
title_fullStr Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System
title_full_unstemmed Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System
title_short Which Is a More Accurate Predictor in Colorectal Survival Analysis? Nine Data Mining Algorithms vs. the TNM Staging System
title_sort which is a more accurate predictor in colorectal survival analysis? nine data mining algorithms vs. the tnm staging system
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3404978/
https://www.ncbi.nlm.nih.gov/pubmed/22848691
http://dx.doi.org/10.1371/journal.pone.0042015
work_keys_str_mv AT gaopeng whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT zhouxin whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT wangzhenning whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT songyongxi whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT tonglinlin whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT xuyingying whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT yuezhenyu whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem
AT xuhuimian whichisamoreaccuratepredictorincolorectalsurvivalanalysisninedataminingalgorithmsvsthetnmstagingsystem