Cargando…
On Gap-Based Lower Bounding Techniques for Best-Arm Identification
In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be re...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517353/ https://www.ncbi.nlm.nih.gov/pubmed/33286559 http://dx.doi.org/10.3390/e22070788 |
_version_ | 1783587210698686464 |
---|---|
author | Truong, Lan V. Scarlett, Jonathan |
author_facet | Truong, Lan V. Scarlett, Jonathan |
author_sort | Truong, Lan V. |
collection | PubMed |
description | In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case that the rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification. |
format | Online Article Text |
id | pubmed-7517353 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-75173532020-11-09 On Gap-Based Lower Bounding Techniques for Best-Arm Identification Truong, Lan V. Scarlett, Jonathan Entropy (Basel) Article In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case that the rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification. MDPI 2020-07-20 /pmc/articles/PMC7517353/ /pubmed/33286559 http://dx.doi.org/10.3390/e22070788 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Truong, Lan V. Scarlett, Jonathan On Gap-Based Lower Bounding Techniques for Best-Arm Identification |
title | On Gap-Based Lower Bounding Techniques for Best-Arm Identification |
title_full | On Gap-Based Lower Bounding Techniques for Best-Arm Identification |
title_fullStr | On Gap-Based Lower Bounding Techniques for Best-Arm Identification |
title_full_unstemmed | On Gap-Based Lower Bounding Techniques for Best-Arm Identification |
title_short | On Gap-Based Lower Bounding Techniques for Best-Arm Identification |
title_sort | on gap-based lower bounding techniques for best-arm identification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517353/ https://www.ncbi.nlm.nih.gov/pubmed/33286559 http://dx.doi.org/10.3390/e22070788 |
work_keys_str_mv | AT truonglanv ongapbasedlowerboundingtechniquesforbestarmidentification AT scarlettjonathan ongapbasedlowerboundingtechniquesforbestarmidentification |