Statistical learning methods as a preprocessing step for survival analysis: evaluation of concept using lung cancer data

BACKGROUND: Statistical learning (SL) techniques can address non-linear relationships and small datasets but do not provide an output that has an epidemiologic interpretation. METHODS: A small set of clinical variables (CVs) for stage-1 non-small cell lung cancer patients was used to evaluate an app...

Bibliographic Details
Main Authors: Behera, Madhusmita, Fowler, Erin E, Owonikoko, Taofeek K, Land, Walker H, Mayfield, William, Chen, Zhengjia, Khuri, Fadlo R, Ramalingam, Suresh S, Heine, John J
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3280940/
https://www.ncbi.nlm.nih.gov/pubmed/22067671
http://dx.doi.org/10.1186/1475-925X-10-97
author Behera, Madhusmita
Fowler, Erin E
Owonikoko, Taofeek K
Land, Walker H
Mayfield, William
Chen, Zhengjia
Khuri, Fadlo R
Ramalingam, Suresh S
Heine, John J
author_sort Behera, Madhusmita
collection PubMed
description BACKGROUND: Statistical learning (SL) techniques can address non-linear relationships and small datasets but do not provide an output that has an epidemiologic interpretation.
METHODS: A small set of clinical variables (CVs) for stage-1 non-small cell lung cancer patients was used to evaluate an approach for using SL methods as a preprocessing step for survival analysis. A stochastic method of training a probabilistic neural network (PNN) was used with differential evolution (DE) optimization. Survival scores were derived stochastically by combining CVs with the PNN. Patients (n = 151) were dichotomized into favorable (n = 92) and unfavorable (n = 59) survival outcome groups. These PNN-derived scores were used with logistic regression (LR) modeling to predict favorable survival outcome and were integrated into the survival analysis (i.e., Kaplan-Meier analysis and Cox regression). The hybrid modeling was compared with the respective modeling using raw CVs. The area under the receiver operating characteristic curve (Az) was used to compare model predictive capability. Odds ratios (ORs) and hazard ratios (HRs), with 95% confidence intervals (CIs), were used to compare disease associations.
RESULTS: The LR model with the best predictive capability gave Az = 0.703. While controlling for gender and tumor grade, the OR = 0.63 (CI: 0.43, 0.91) per standard deviation (SD) increase in age indicates that increasing age confers an unfavorable outcome. The hybrid LR model gave Az = 0.778 by combining age and tumor grade with the PNN and controlling for gender. The PNN score and age translate inversely with respect to risk. The OR = 0.27 (CI: 0.14, 0.53) per SD increase in PNN score indicates that a decreased score confers an unfavorable outcome. The tumor-grade-adjusted hazard for patients above the median age compared with those below the median was HR = 1.78 (CI: 1.06, 3.02), whereas the hazard for patients below the median PNN score compared with those above the median was HR = 4.0 (CI: 2.13, 7.14).
CONCLUSION: We have provided preliminary evidence showing that the SL preprocessing may provide benefits in comparison with accepted approaches. The work will require further evaluation with varying datasets to confirm these findings.
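The abstract summarizes model performance with Az and reports ratio measures against a chosen reference group. The following is a minimal illustrative sketch of those two ideas only, not the authors' PNN/DE pipeline. It assumes Az can be read as the probability that a randomly chosen favorable-outcome patient scores higher than a randomly chosen unfavorable-outcome patient (the Mann-Whitney interpretation of the area under the ROC curve). The function names and the toy scores are hypothetical; only the HR and CI values come from the abstract.

```python
def auc_mann_whitney(pos_scores, neg_scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def invert_ratio(ratio, ci_low, ci_high):
    """Re-express an odds or hazard ratio with the comparison and reference
    groups swapped: take reciprocals and swap the confidence bounds."""
    return 1.0 / ratio, 1.0 / ci_high, 1.0 / ci_low


# Hypothetical toy scores (NOT data from the study).
favorable = [0.9, 0.8, 0.6, 0.55]
unfavorable = [0.7, 0.4, 0.3]
az = auc_mann_whitney(favorable, unfavorable)  # 10 of 12 pairs are ordered correctly
print(f"toy Az = {az:.3f}")

# HR reported in the abstract: below-median vs. above-median PNN score.
hr, lo, hi = invert_ratio(4.0, 2.13, 7.14)
print(f"above vs. below median: HR = {hr:.2f} (CI: {lo:.2f}, {hi:.2f})")
```

Inverting the ratio only changes which group serves as the reference; it carries the same information as the reported HR = 4.0 and does not add any new result.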
format Online
Article
Text
id pubmed-3280940
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3280940 2012-02-17 Biomed Eng Online Research BioMed Central 2011-11-08 /pmc/articles/PMC3280940/ /pubmed/22067671 http://dx.doi.org/10.1186/1475-925X-10-97 Text en Copyright ©2011 Behera et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Statistical learning methods as a preprocessing step for survival analysis: evaluation of concept using lung cancer data
topic Research