Adversarial Thresholding Semi-Bandits

The classical multi-armed bandit is one of the most common examples of sequential decision-making, in which a player either trades off exploiting and exploring arms to maximise some payoff or purely explores arms until the optimal arm is identified. In particular, a bandit player wanting to only pull arm...

Bibliographic Details
Main author: Bower, Craig Steven
Language: eng
Published: 2021
Subjects:
Online access: http://cds.cern.ch/record/2790271