Adversarial Thresholding Semi-Bandits

The classical multi-armed bandit is one of the most common examples of sequential decision-making, either by trading off between exploiting and exploring arms to maximise some payoff or by purely exploring arms until the optimal arm is identified. In particular, a bandit player wanting to only pull arm...

Bibliographic Details
Main author:	Bower, Craig Steven
Language:	eng
Published:	2021
Subjects:	Computing and Computers; Detectors and Experimental Techniques
Online access:	http://cds.cern.ch/record/2790271
