
Bias-variance decomposition of absolute errors for diagnosing regression models of continuous data

Bias-variance decomposition (BVD) is a powerful tool for understanding and improving data-driven models. It reveals sources of estimation errors. Existing literature has defined BVD for squared error but not absolute error, while absolute error is the more natural error metric and has shown advantag...


Bibliographic Details
Main author: Gao, Jing
Format: Online Article Text
Language: English
Published: Elsevier 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369249/
https://www.ncbi.nlm.nih.gov/pubmed/34430928
http://dx.doi.org/10.1016/j.patter.2021.100309
author Gao, Jing
collection PubMed
description Bias-variance decomposition (BVD) is a powerful tool for understanding and improving data-driven models. It reveals the sources of estimation error. Existing literature has defined BVD for squared error but not for absolute error, even though absolute error is the more natural error metric and has shown advantages over squared error in many scientific fields. Here, I analytically derive the absolute-error BVD, empirically investigate its behavior, and compare it with that of other error metrics. Different error metrics offer distinctly different perspectives. I find that the bias/variance trade-off commonly assumed under squared error is often absent under absolute error, and that ensembles, a technique that never hurts under squared error, can harm performance under absolute error. Compared with squared-error BVD, absolute-error BVD better promotes model traits that reduce estimation residuals and better illustrates the relative importance of different error sources. As data scientists pay increasing attention to uncertainty issues, the technique introduced here can be a useful addition to a data-driven modeler's toolset.
format Online
Article
Text
id pubmed-8369249
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8369249 2021-08-23 Gao, Jing Patterns (N Y) Article Elsevier 2021-07-21 /pmc/articles/PMC8369249/ /pubmed/34430928 http://dx.doi.org/10.1016/j.patter.2021.100309 Text en © 2021 The Author https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Bias-variance decomposition of absolute errors for diagnosing regression models of continuous data
topic Article