
Which Is Better: Holdout or Full-Sample Classifier Design?


Bibliographic Details
Main Authors:	Brun, Marcel; Xu, Qian; Dougherty, Edward R
Format:	Online Article Text
Language:	English
Published:	Springer 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3171393/ https://www.ncbi.nlm.nih.gov/pubmed/18483613 http://dx.doi.org/10.1155/2008/297945

Description
Summary:	Is it better to design a classifier and estimate its error on the full sample or to design a classifier on a training subset and estimate its error on the holdout test subset? Full-sample design provides the better classifier; nevertheless, one might choose holdout with the hope of better error estimation. A conservative criterion to decide the best course is to aim at a classifier whose error is less than a given bound. Then the choice between full-sample and holdout designs depends on which possesses the smaller expected bound. Using this criterion, we examine the choice between holdout and several full-sample error estimators using covariance models and a patient-data model. Full-sample design consistently outperforms holdout design. The relation between the two designs is revealed via a decomposition of the expected bound into the sum of the expected true error and the expected conditional standard deviation of the true error.
