
Ensemble Estimation of Information Divergence †

Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator for general bounded density support sets is derived, where knowledge of the support boundary, and therefore the boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for the tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We provide extensive simulation results that verify the theoretical results of the paper. Finally, we apply the proposed estimator to estimate bounds on the Bayes error rate for a cell classification problem.
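The abstract's plug-in construction can be made concrete. Below is a minimal sketch (not the authors' published implementation) of a leave-one-out kernel density plug-in estimator of the Rényi-α divergence, built on the identity D_α(f‖g) = (α − 1)⁻¹ log E_f[(f(X)/g(X))^(α−1)]. The Gaussian kernel, the fixed bandwidth, the clipping guard, and all function names are assumptions of this sketch; the paper's boundary analysis and tuning-parameter guidelines are not reproduced here.

```python
import numpy as np

def gaussian_kde_at(points, samples, bandwidth, leave_one_out=False):
    """Gaussian-kernel density estimate of the law of `samples`, evaluated
    at `points`. With leave_one_out=True, `points` must be `samples` itself."""
    d = points.shape[1]
    sq_dists = np.sum((points[:, None, :] - samples[None, :, :]) ** 2, axis=2)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth**2))
    kernel /= (2.0 * np.pi * bandwidth**2) ** (d / 2.0)
    if leave_one_out:
        np.fill_diagonal(kernel, 0.0)          # drop each point's own term
        return kernel.sum(axis=1) / (samples.shape[0] - 1)
    return kernel.mean(axis=1)

def renyi_alpha_plugin(X, Y, alpha=0.8, bandwidth=0.3):
    """Plug-in estimate of D_alpha(f || g) from X ~ f and Y ~ g, via
    D_alpha = log(E_f[(f/g)**(alpha - 1)]) / (alpha - 1)."""
    f_hat = gaussian_kde_at(X, X, bandwidth, leave_one_out=True)
    g_hat = gaussian_kde_at(X, Y, bandwidth)
    ratio = np.clip(f_hat / g_hat, 1e-12, None)  # guard against tiny g_hat
    return float(np.log(np.mean(ratio ** (alpha - 1.0))) / (alpha - 1.0))

# Hypothetical usage: two Gaussian samples with shifted means.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 2))
Y = rng.normal(0.5, 1.0, size=(500, 2))
print(renyi_alpha_plugin(X, Y, alpha=0.8))
```

The bias of such a plug-in estimator is dominated by bandwidth-dependent terms; cancelling those terms is what the ensemble step described in the abstract is for (see the sketch after the record below).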


Bibliographic Details
Main Authors: Moon, Kevin R., Sricharan, Kumar, Greenewald, Kristjan, Hero, Alfred O.
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513085/
https://www.ncbi.nlm.nih.gov/pubmed/33265649
http://dx.doi.org/10.3390/e20080560
_version_ 1783586306570321920
author Moon, Kevin R.
Sricharan, Kumar
Greenewald, Kristjan
Hero, Alfred O.
author_facet Moon, Kevin R.
Sricharan, Kumar
Greenewald, Kristjan
Hero, Alfred O.
author_sort Moon, Kevin R.
collection PubMed
description Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator for general bounded density support sets is derived, where knowledge of the support boundary, and therefore the boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for the tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We provide extensive simulation results that verify the theoretical results of the paper. Finally, we apply the proposed estimator to estimate bounds on the Bayes error rate for a cell classification problem.
format Online
Article
Text
id pubmed-7513085
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7513085 2020-11-09 Ensemble Estimation of Information Divergence † Moon, Kevin R. Sricharan, Kumar Greenewald, Kristjan Hero, Alfred O. Entropy (Basel) Article Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator for general bounded density support sets is derived, where knowledge of the support boundary, and therefore the boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for the tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We provide extensive simulation results that verify the theoretical results of the paper. Finally, we apply the proposed estimator to estimate bounds on the Bayes error rate for a cell classification problem. MDPI 2018-07-27 /pmc/articles/PMC7513085/ /pubmed/33265649 http://dx.doi.org/10.3390/e20080560 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Moon, Kevin R.
Sricharan, Kumar
Greenewald, Kristjan
Hero, Alfred O.
Ensemble Estimation of Information Divergence †
title Ensemble Estimation of Information Divergence †
title_full Ensemble Estimation of Information Divergence †
title_fullStr Ensemble Estimation of Information Divergence †
title_full_unstemmed Ensemble Estimation of Information Divergence †
title_short Ensemble Estimation of Information Divergence †
title_sort ensemble estimation of information divergence †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513085/
https://www.ncbi.nlm.nih.gov/pubmed/33265649
http://dx.doi.org/10.3390/e20080560
work_keys_str_mv AT moonkevinr ensembleestimationofinformationdivergence
AT sricharankumar ensembleestimationofinformationdivergence
AT greenewaldkristjan ensembleestimationofinformationdivergence
AT heroalfredo ensembleestimationofinformationdivergence
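As a companion to the plug-in sketch above, the following illustrates the optimally weighted ensemble idea from the abstract: compute the plug-in estimate at several bandwidths and combine the results with weights that sum to one while cancelling the assumed lower-order bias terms, so that the leading bias of the combination is higher order. The bias basis l^i and the bandwidth schedule h(l) = l · n^(−1/(2d)) are assumptions for this sketch; the paper derives the exact basis from its smoothness conditions and works with a relaxed convex program rather than this hard-constrained minimum-norm version.

```python
import numpy as np
# Reuses renyi_alpha_plugin from the sketch above.

def ensemble_weights(ls, d):
    """Minimum-norm weights w with sum(w) = 1 that zero out bias terms
    assumed proportional to l**i for i = 1..d-1. Requires len(ls) >= d
    so that the constraint system has an exact solution."""
    L = np.asarray(ls, dtype=float)
    A = np.vstack([np.ones_like(L)] + [L**i for i in range(1, d)])
    b = np.zeros(A.shape[0])
    b[0] = 1.0                                  # the sum-to-one constraint
    w, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm exact solution
    return w

def renyi_alpha_ensemble(X, Y, alpha=0.8, ls=(1.0, 1.5, 2.0, 2.5, 3.0)):
    """Weighted ensemble of plug-in estimates over the assumed bandwidth
    schedule h(l) = l * n**(-1/(2*d))."""
    n, d = X.shape
    w = ensemble_weights(ls, d)
    estimates = np.array(
        [renyi_alpha_plugin(X, Y, alpha, l * n ** (-1.0 / (2 * d))) for l in ls]
    )
    return float(w @ estimates)
```

Because the weights depend only on the bandwidth indices and the dimension, they can be computed once and reused across datasets of the same size and dimension, which is consistent with the abstract's claim that the estimator is robust to the choice of tuning parameters.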