
Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation

Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive pe...


Bibliographic Details
Main authors:	Tsamardinos, Ioannis, Greasidou, Elissavet, Borboudakis, Giorgos
Format:	Online Article Text
Language:	English
Published:	Springer US 2018
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6191021/ https://www.ncbi.nlm.nih.gov/pubmed/30393425 http://dx.doi.org/10.1007/s10994-018-5714-4

_version_	1783363654107791360
author	Tsamardinos, Ioannis Greasidou, Elissavet Borboudakis, Giorgos
author_facet	Tsamardinos, Ioannis Greasidou, Elissavet Borboudakis, Giorgos
author_sort	Tsamardinos, Ioannis
collection	PubMed
description	Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best configuration is optimistically biased. We present an efficient bootstrap method that corrects for the bias, called Bootstrap Bias Corrected CV (BBC-CV). BBC-CV’s main idea is to bootstrap the whole process of selecting the best-performing configuration on the out-of-sample predictions of each configuration, without additional training of models. In comparison to the alternatives, namely the nested cross-validation (Varma and Simon in BMC Bioinform 7(1):91, 2006) and a method by Tibshirani and Tibshirani (Ann Appl Stat 822–829, 2009), BBC-CV is computationally more efficient, has smaller variance and bias, and is applicable to any metric of performance (accuracy, AUC, concordance index, mean squared error). Subsequently, we employ again the idea of bootstrapping the out-of-sample predictions to speed up the CV process. Specifically, using a bootstrap-based statistical criterion we stop training of models on new folds of inferior (with high probability) configurations. We name the method Bootstrap Bias Corrected with Dropping CV (BBCD-CV) that is both efficient and provides accurate performance estimates.
format	Online Article Text
id	pubmed-6191021
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-6191021 2018-10-31 Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation Tsamardinos, Ioannis Greasidou, Elissavet Borboudakis, Giorgos Mach Learn Article
Springer US 2018-05-09 2018 /pmc/articles/PMC6191021/ /pubmed/30393425 http://dx.doi.org/10.1007/s10994-018-5714-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Article Tsamardinos, Ioannis Greasidou, Elissavet Borboudakis, Giorgos Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
title	Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
title_full	Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
title_fullStr	Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
title_full_unstemmed	Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
title_short	Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
title_sort	bootstrapping the out-of-sample predictions for efficient and accurate cross-validation
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6191021/ https://www.ncbi.nlm.nih.gov/pubmed/30393425 http://dx.doi.org/10.1007/s10994-018-5714-4
work_keys_str_mv	AT tsamardinosioannis bootstrappingtheoutofsamplepredictionsforefficientandaccuratecrossvalidation AT greasidouelissavet bootstrappingtheoutofsamplepredictionsforefficientandaccuratecrossvalidation AT borboudakisgiorgos bootstrappingtheoutofsamplepredictionsforefficientandaccuratecrossvalidation
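
The core idea in the description above (bootstrapping the selection of the best configuration on the pooled out-of-sample predictions, with no retraining) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (bbc_cv_estimate, accuracy), the use of accuracy as the metric, the number of bootstrap repetitions, and the synthetic data are all assumptions made for illustration. The sketch assumes that each bootstrap sample of rows is used to pick the best configuration and that the out-of-bag rows are then used to score that pick.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of matching labels (higher is better)."""
    return float(np.mean(y_true == y_pred))


def bbc_cv_estimate(oos_preds, y, metric, n_boot=200, seed=0):
    """Bootstrap-corrected estimate for the best of several configurations.

    oos_preds: array of shape (n_samples, n_configs) holding the pooled
               out-of-sample (cross-validated) prediction of every
               configuration for every sample.
    metric:    callable(y_true, y_pred) -> float, higher is better
               (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_configs = oos_preds.shape
    out_of_bag_scores = []
    for _ in range(n_boot):
        # Resample rows with replacement; rows never drawn are out-of-bag.
        in_bag = rng.integers(0, n_samples, size=n_samples)
        out_mask = np.ones(n_samples, dtype=bool)
        out_mask[in_bag] = False
        if not out_mask.any():
            continue  # degenerate resample with no out-of-bag rows
        # Select the best configuration on the in-bag rows only.
        in_bag_scores = [
            metric(y[in_bag], oos_preds[in_bag, j]) for j in range(n_configs)
        ]
        best = int(np.argmax(in_bag_scores))
        # Score the selected configuration on rows it was not selected on.
        out_of_bag_scores.append(metric(y[out_mask], oos_preds[out_mask, best]))
    return float(np.mean(out_of_bag_scores))


# Synthetic illustration: 50 configurations that all guess at random, so
# the true accuracy of every one of them is 0.5.
data_rng = np.random.default_rng(42)
y = data_rng.integers(0, 2, size=100)
oos_preds = data_rng.integers(0, 2, size=(100, 50))

# Naive estimate: report the cross-validated score of the winner directly.
naive = max(accuracy(y, oos_preds[:, j]) for j in range(oos_preds.shape[1]))
bbc = bbc_cv_estimate(oos_preds, y, accuracy)
print(f"naive best-configuration accuracy: {naive:.3f}")
print(f"bootstrap-corrected estimate:      {bbc:.3f}")
```

Because every configuration in this toy setup is a random guesser, the naive winner's score overstates its real performance purely through selection, while the bootstrap-corrected estimate should land noticeably lower. No models are trained inside the loop; only the stored predictions are resampled, which is where the computational saving described above comes from.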