Can Hyperparameter Tuning Improve the Performance of a Super Learner?: A Case Study

Bibliographic Details
Main authors: Wong, Jenna; Manderson, Travis; Abrahamowicz, Michal; Buckeridge, David L; Tamblyn, Robyn
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6553550/
https://www.ncbi.nlm.nih.gov/pubmed/30985529
http://dx.doi.org/10.1097/EDE.0000000000001027
Description
Summary: BACKGROUND: Super learning is an ensemble machine learning approach used increasingly as an alternative to classical prediction techniques. When implementing super learning, however, leaving the hyperparameters of its component algorithms untuned may adversely affect the performance of the super learner. METHODS: In this case study, we used data from a Canadian electronic prescribing system to predict when primary care physicians prescribed antidepressants for indications other than depression. The analysis included 73,576 antidepressant prescriptions and 373 candidate predictors. We derived two super learners: one using tuned hyperparameter values for each machine learning algorithm, identified through an iterative grid search procedure, and the other using the default values. We compared the performance of the tuned super learner to that of the super learner using default values (“untuned”) and to that of a carefully constructed logistic regression model from a previous analysis. RESULTS: The tuned super learner had a scaled Brier score (R²) of 0.322 (95% confidence interval [CI] = 0.267, 0.362). In comparison, the untuned super learner had a scaled Brier score of 0.309 (95% CI = 0.256, 0.353), corresponding to an efficiency loss of 4% (relative efficiency 0.96; 95% CI = 0.93, 0.99). The previously derived logistic regression model had a scaled Brier score of 0.307 (95% CI = 0.245, 0.360), corresponding to an efficiency loss of 5% relative to the tuned super learner (relative efficiency 0.95; 95% CI = 0.88, 1.01). CONCLUSIONS: In this case study, hyperparameter tuning produced a super learner that performed slightly better than an untuned super learner. Tuning the hyperparameters of individual algorithms in a super learner may help optimize performance.
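
The abstract reports scaled Brier scores and relative efficiencies without giving formulas. The sketch below shows one common way such quantities are computed. It assumes the scaled Brier score is 1 minus the ratio of the model's Brier score to that of a reference model that always predicts the outcome prevalence, and that relative efficiency is the ratio of two scaled Brier scores. Both definitions are assumptions here, although the ratio reproduces the rounded figures in the abstract. The function names are hypothetical, not taken from the paper.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) == 0 or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Assumed definition: 1 - Brier(model) / Brier(prevalence-only reference)."""
    prevalence = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [prevalence] * len(y_true))
    if reference == 0:
        raise ValueError("outcome has no variation; scaled Brier score is undefined")
    return 1.0 - brier_score(y_true, y_prob) / reference


def relative_efficiency(candidate: float, benchmark: float) -> float:
    """Assumed definition: ratio of the candidate's scaled Brier score to the benchmark's."""
    return candidate / benchmark


# Point estimates quoted in the abstract (tuned super learner is the benchmark).
tuned, untuned, logistic = 0.322, 0.309, 0.307
print(round(relative_efficiency(untuned, tuned), 2))   # 0.96
print(round(relative_efficiency(logistic, tuned), 2))  # 0.95
```

Under these assumptions, a relative efficiency of 0.96 corresponds to the 4% efficiency loss quoted for the untuned super learner. The confidence intervals in the abstract cannot be recovered from the point estimates alone and are not reproduced here.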