
Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures


Bibliographic Details
Main Authors: Kompa, Benjamin, Snoek, Jasper, Beam, Andrew L.
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700765/
https://www.ncbi.nlm.nih.gov/pubmed/34945914
http://dx.doi.org/10.3390/e23121608
author Kompa, Benjamin
Snoek, Jasper
Beam, Andrew L.
collection PubMed
description Uncertainty quantification for complex deep learning models is increasingly important as these techniques see growing use in high-stakes, real-world settings. Currently, the quality of a model’s uncertainty is evaluated using point-prediction metrics, such as the negative log-likelihood (NLL), expected calibration error (ECE) or the Brier score on held-out data. Marginal coverage of prediction intervals or sets, a well-known concept in the statistical literature, is an intuitive alternative to these metrics but has yet to be systematically studied for many popular uncertainty quantification techniques for deep learning models. With marginal coverage and the complementary notion of the width of a prediction interval, downstream users of deployed machine learning models can better understand uncertainty quantification both on a global dataset level and on a per-sample basis. In this study, we provide the first large-scale evaluation of the empirical frequentist coverage properties of well-known uncertainty quantification techniques on a suite of regression and classification tasks. We find that, in general, some methods do achieve desirable coverage properties on in-distribution samples, but that coverage is not maintained on out-of-distribution data. Our results demonstrate the failings of current uncertainty quantification techniques as dataset shift increases and reinforce coverage as an important metric in developing models for real-world applications.
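The coverage and width metrics described in the abstract are straightforward to compute from model outputs. A minimal sketch follows, assuming NumPy arrays of targets and interval endpoints for regression and per-sample prediction sets for classification; the names y_true, lower, upper, and pred_sets, and the 95% toy check, are illustrative assumptions rather than the authors' evaluation code.

```python
import numpy as np

def interval_coverage_and_width(y_true, lower, upper):
    """Empirical marginal coverage and mean width of regression prediction intervals."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    covered = (y_true >= lower) & (y_true <= upper)  # per-sample coverage indicator
    return covered.mean(), (upper - lower).mean()

def set_coverage_and_size(y_true, pred_sets):
    """Empirical marginal coverage and mean size of classification prediction sets."""
    covered = [y in s for y, s in zip(y_true, pred_sets)]
    sizes = [len(s) for s in pred_sets]
    return float(np.mean(covered)), float(np.mean(sizes))

if __name__ == "__main__":
    # Toy check: fixed 95% normal-quantile intervals should cover roughly 95% of draws.
    rng = np.random.default_rng(0)
    y = rng.normal(size=10_000)
    cov, width = interval_coverage_and_width(y, np.full(10_000, -1.96), np.full(10_000, 1.96))
    print(f"coverage={cov:.3f}, mean width={width:.3f}")
```

In this view, methods are compared on two complementary axes: an interval or set that attains the nominal coverage level with the smallest average width or size is preferred.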
format Online
Article
Text
id pubmed-8700765
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8700765 2021-12-24
Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures
Kompa, Benjamin; Snoek, Jasper; Beam, Andrew L.
Entropy (Basel), Article
MDPI 2021-11-30 /pmc/articles/PMC8700765/ /pubmed/34945914 http://dx.doi.org/10.3390/e23121608
Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700765/
https://www.ncbi.nlm.nih.gov/pubmed/34945914
http://dx.doi.org/10.3390/e23121608