Cargando…

Convergence Rates for Empirical Estimation of Binary Classification Bounds

Bounding the best achievable error probability for binary classification problems is relevant to many applications including machine learning, signal processing, and information theory. Many bounds on the Bayes binary classification error rate depend on information divergences between the pair of cl...

Descripción completa

Detalles Bibliográficos
Autores principales: Sekeh, Salimeh Yasaei, Noshad, Morteza, Moon, Kevin R., Hero, Alfred O.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2019
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514489/
http://dx.doi.org/10.3390/e21121144
_version_ 1783586599630536704
author Sekeh, Salimeh Yasaei
Noshad, Morteza
Moon, Kevin R.
Hero, Alfred O.
author_facet Sekeh, Salimeh Yasaei
Noshad, Morteza
Moon, Kevin R.
Hero, Alfred O.
author_sort Sekeh, Salimeh Yasaei
collection PubMed
description Bounding the best achievable error probability for binary classification problems is relevant to many applications including machine learning, signal processing, and information theory. Many bounds on the Bayes binary classification error rate depend on information divergences between the pair of class distributions. Recently, the Henze–Penrose (HP) divergence has been proposed for bounding classification error probability. We consider the problem of empirically estimating the HP-divergence from random samples. We derive a bound on the convergence rate for the Friedman–Rafsky (FR) estimator of the HP-divergence, which is related to a multivariate runs statistic for testing between two distributions. The FR estimator is derived from a multicolored Euclidean minimal spanning tree (MST) that spans the merged samples. We obtain a concentration inequality for the Friedman–Rafsky estimator of the Henze–Penrose divergence. We validate our results experimentally and illustrate their application to real datasets.
format Online
Article
Text
id pubmed-7514489
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-75144892020-11-09 Convergence Rates for Empirical Estimation of Binary Classification Bounds Sekeh, Salimeh Yasaei Noshad, Morteza Moon, Kevin R. Hero, Alfred O. Entropy (Basel) Article Bounding the best achievable error probability for binary classification problems is relevant to many applications including machine learning, signal processing, and information theory. Many bounds on the Bayes binary classification error rate depend on information divergences between the pair of class distributions. Recently, the Henze–Penrose (HP) divergence has been proposed for bounding classification error probability. We consider the problem of empirically estimating the HP-divergence from random samples. We derive a bound on the convergence rate for the Friedman–Rafsky (FR) estimator of the HP-divergence, which is related to a multivariate runs statistic for testing between two distributions. The FR estimator is derived from a multicolored Euclidean minimal spanning tree (MST) that spans the merged samples. We obtain a concentration inequality for the Friedman–Rafsky estimator of the Henze–Penrose divergence. We validate our results experimentally and illustrate their application to real datasets. MDPI 2019-11-23 /pmc/articles/PMC7514489/ http://dx.doi.org/10.3390/e21121144 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Sekeh, Salimeh Yasaei
Noshad, Morteza
Moon, Kevin R.
Hero, Alfred O.
Convergence Rates for Empirical Estimation of Binary Classification Bounds
title Convergence Rates for Empirical Estimation of Binary Classification Bounds
title_full Convergence Rates for Empirical Estimation of Binary Classification Bounds
title_fullStr Convergence Rates for Empirical Estimation of Binary Classification Bounds
title_full_unstemmed Convergence Rates for Empirical Estimation of Binary Classification Bounds
title_short Convergence Rates for Empirical Estimation of Binary Classification Bounds
title_sort convergence rates for empirical estimation of binary classification bounds
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514489/
http://dx.doi.org/10.3390/e21121144
work_keys_str_mv AT sekehsalimehyasaei convergenceratesforempiricalestimationofbinaryclassificationbounds
AT noshadmorteza convergenceratesforempiricalestimationofbinaryclassificationbounds
AT moonkevinr convergenceratesforempiricalestimationofbinaryclassificationbounds
AT heroalfredo convergenceratesforempiricalestimationofbinaryclassificationbounds