Cargando…

On Gap-Based Lower Bounding Techniques for Best-Arm Identification

In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be re...

Descripción completa

Detalles Bibliográficos
Autores principales: Truong, Lan V., Scarlett, Jonathan
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517353/
https://www.ncbi.nlm.nih.gov/pubmed/33286559
http://dx.doi.org/10.3390/e22070788
_version_ 1783587210698686464
author Truong, Lan V.
Scarlett, Jonathan
author_facet Truong, Lan V.
Scarlett, Jonathan
author_sort Truong, Lan V.
collection PubMed
description In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case that the rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification.
format Online
Article
Text
id pubmed-7517353
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-75173532020-11-09 On Gap-Based Lower Bounding Techniques for Best-Arm Identification Truong, Lan V. Scarlett, Jonathan Entropy (Basel) Article In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case that the rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification. MDPI 2020-07-20 /pmc/articles/PMC7517353/ /pubmed/33286559 http://dx.doi.org/10.3390/e22070788 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Truong, Lan V.
Scarlett, Jonathan
On Gap-Based Lower Bounding Techniques for Best-Arm Identification
title On Gap-Based Lower Bounding Techniques for Best-Arm Identification
title_full On Gap-Based Lower Bounding Techniques for Best-Arm Identification
title_fullStr On Gap-Based Lower Bounding Techniques for Best-Arm Identification
title_full_unstemmed On Gap-Based Lower Bounding Techniques for Best-Arm Identification
title_short On Gap-Based Lower Bounding Techniques for Best-Arm Identification
title_sort on gap-based lower bounding techniques for best-arm identification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517353/
https://www.ncbi.nlm.nih.gov/pubmed/33286559
http://dx.doi.org/10.3390/e22070788
work_keys_str_mv AT truonglanv ongapbasedlowerboundingtechniquesforbestarmidentification
AT scarlettjonathan ongapbasedlowerboundingtechniquesforbestarmidentification