A genetic programming approach to development of clinical prediction models: A case study in symptomatic cardiovascular disease

Bibliographic Details
Main Authors: Bannister, Christian A., Halcox, Julian P., Currie, Craig J., Preece, Alun, Spasić, Irena
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6122798/
https://www.ncbi.nlm.nih.gov/pubmed/30180175
http://dx.doi.org/10.1371/journal.pone.0202685
_version_ 1783352729672876032
author Bannister, Christian A.
Halcox, Julian P.
Currie, Craig J.
Preece, Alun
Spasić, Irena
author_sort Bannister, Christian A.
collection PubMed
description BACKGROUND: Genetic programming (GP) is an evolutionary computing methodology capable of identifying complex, non-linear patterns in large data sets. Despite the potential advantages of GP over more typical frequentist statistical methods, its applications to survival analysis are rare, at best. The aim of this study was to determine the utility of GP for the automatic development of clinical prediction models.
METHODS: We compared GP against the commonly used Cox regression technique in terms of the development and performance of a cardiovascular risk score using data from the SMART study, a prospective cohort study of patients with symptomatic cardiovascular disease. The composite endpoint was cardiovascular death, non-fatal stroke, and myocardial infarction. A total of 3,873 patients aged 19–82 years were enrolled in the study between 1996 and 2006. The cohort was split 70:30 into derivation and validation sets. The derivation set was used for development of both the GP and Cox regression models. These models were then used to predict the discrete hazards at t = 1, 3, and 5 years. The predictive ability of both models was evaluated in terms of their risk discrimination and calibration using the validation set.
RESULTS: The discrimination of both models was comparable. At time points t = 1, 3, and 5 years, the C-index was 0.59, 0.69, and 0.64 for the GP model and 0.66, 0.70, and 0.70 for the Cox regression model, respectively. At the same time points, the calibration of both models, assessed using calibration plots and the generalization of the Hosmer-Lemeshow test statistic, was also comparable, although the Cox model was better calibrated to the validation data.
CONCLUSION: Using empirical data, we demonstrated that a prediction model developed automatically by GP has predictive ability comparable to that of manually tuned Cox regression. The GP model was more complex, but it was developed in a fully automated way and comprised fewer covariates. Furthermore, it did not require the expertise normally needed for its derivation, thereby alleviating the knowledge elicitation bottleneck. Overall, GP demonstrated considerable potential as a method for the automated development of clinical prediction models for diagnostic and prognostic purposes.
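Note on the discrimination measure: the abstract reports a C-index for each model but does not say how it was computed. The block below is a minimal sketch of Harrell's concordance index for right-censored data, offered only as an illustration of what the statistic measures. The function name `concordance_index`, the toy values, and the decision to skip pairs with tied follow-up times are assumptions made for this example and are not taken from the article.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C for right-censored data (illustrative sketch).

    times  -- observed follow-up time per subject
    events -- 1 if the endpoint occurred, 0 if the subject was censored
    risks  -- predicted risk score per subject (higher = earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Assumption: tied times are skipped; a pair is also unusable
        # when the subject with the shorter follow-up was censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not from the SMART cohort.
toy_times = [2.0, 4.0, 1.0, 5.0, 3.0]
toy_events = [1, 0, 1, 0, 1]
toy_risks = [0.8, 0.3, 0.9, 0.1, 0.5]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which is the scale on which the reported values of 0.59 to 0.70 should be read.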
format Online
Article
Text
id pubmed-6122798
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6122798 2018-09-16 A genetic programming approach to development of clinical prediction models: A case study in symptomatic cardiovascular disease. Bannister, Christian A.; Halcox, Julian P.; Currie, Craig J.; Preece, Alun; Spasić, Irena. PLoS One. Research Article. Public Library of Science 2018-09-04 /pmc/articles/PMC6122798/ /pubmed/30180175 http://dx.doi.org/10.1371/journal.pone.0202685 Text en © 2018 Bannister et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A genetic programming approach to development of clinical prediction models: A case study in symptomatic cardiovascular disease
topic Research Article