
Model Comparison for Breast Cancer Prognosis Based on Clinical Data

We compared the performance of several prediction techniques for breast cancer prognosis, based on AU-ROC (Area Under the ROC curve) performance for different prognosis periods. The analyzed dataset contained 1,981 patients; from an initial 25 variables, the 11 most common clinical predictors were retained...


Bibliographic Details
Main authors: Boughorbel, Sabri, Al-Ali, Rashid, Elkum, Naser
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4714871/
https://www.ncbi.nlm.nih.gov/pubmed/26771838
http://dx.doi.org/10.1371/journal.pone.0146413
author Boughorbel, Sabri
Al-Ali, Rashid
Elkum, Naser
collection PubMed
description We compared the performance of several prediction techniques for breast cancer prognosis, based on AU-ROC (Area Under the ROC curve) performance for different prognosis periods. The analyzed dataset contained 1,981 patients; from an initial 25 variables, the 11 most common clinical predictors were retained. We compared eight models from a wide spectrum of predictive models, namely: Generalized Linear Model (GLM), GLM-Net, Partial Least Squares (PLS), Support Vector Machines (SVM), Random Forests (RF), Neural Networks, k-Nearest Neighbors (k-NN) and Boosted Trees. To compare these models, a paired t-test was applied to the model performance differences obtained from data resampling. Random Forests, Boosted Trees, Partial Least Squares and GLM-Net have superior overall performance, although their performance is only slightly higher than that of the other models. The comparative analysis also allowed us to define a relative variable importance as the average of the variable importance from the different models. Two sets of variables are identified from this analysis. The first includes number of positive lymph nodes, tumor size, cancer grade and estrogen receptor, all of which have an important influence on model predictability. The second set includes variables related to histological parameters and treatment types. The short-term versus long-term contributions of the clinical variables are also analyzed from the comparative models. Of the various cancer treatment plans, the combination of chemotherapy and radiotherapy has the largest impact on cancer prognosis.
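The comparison procedure summarized in the description can be illustrated with a minimal sketch. It is not the authors' code: the synthetic data, the bootstrap resampling scheme, the per-model scaling of importances, and the function names `auc`, `paired_t` and `mean_importance` are all assumptions made for illustration. It shows AU-ROC computed per resample for two hypothetical models, a paired t statistic on the matched AU-ROC values, and a relative variable importance taken as the average over models.

```python
import math
import random
from statistics import mean, stdev


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a, b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def mean_importance(per_model):
    """Average each variable's importance over models.

    Each model's importances are first scaled to sum to 1 so that models
    are comparable (this scaling is an assumption, not from the record).
    """
    variables = list(next(iter(per_model.values())))
    scaled = []
    for imp in per_model.values():
        total = sum(imp.values())
        scaled.append({v: imp[v] / total for v in variables})
    return {v: mean(s[v] for s in scaled) for v in variables}


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    # Synthetic labels and scores for two hypothetical models A and B.
    labels = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    score_a = [y + rng.gauss(0, 1.0) for y in labels]
    score_b = [y + rng.gauss(0, 2.0) for y in labels]

    auc_a, auc_b = [], []
    for _ in range(30):
        # Bootstrap resample (assumed scheme); both models see the same rows,
        # which is what makes the comparison paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        auc_a.append(auc([score_a[i] for i in idx], y))
        auc_b.append(auc([score_b[i] for i in idx], y))

    t, df = paired_t(auc_a, auc_b)
    print(f"mean AUC A={mean(auc_a):.3f} B={mean(auc_b):.3f} t={t:.2f} df={df}")

    # Hypothetical importance values, for illustration only.
    print(mean_importance({
        "model_1": {"lymph_nodes": 3.0, "tumor_size": 1.0},
        "model_2": {"lymph_nodes": 1.0, "tumor_size": 1.0},
    }))
```

Because overlapping resamples are not independent, a t statistic obtained this way is best read as a rough guide rather than an exact test.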
format Online
Article
Text
id pubmed-4714871
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4714871 2016-01-30 Model Comparison for Breast Cancer Prognosis Based on Clinical Data Boughorbel, Sabri Al-Ali, Rashid Elkum, Naser PLoS One Research Article
Public Library of Science 2016-01-15 /pmc/articles/PMC4714871/ /pubmed/26771838 http://dx.doi.org/10.1371/journal.pone.0146413 Text en © 2016 Boughorbel et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Model Comparison for Breast Cancer Prognosis Based on Clinical Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4714871/
https://www.ncbi.nlm.nih.gov/pubmed/26771838
http://dx.doi.org/10.1371/journal.pone.0146413