Cargando…

Maximum Entropy Exploration in Contextual Bandits with Neural Networks and Energy Based Models

Contextual bandits can solve a huge range of real-world problems. However, current popular algorithms to solve them either rely on linear models or unreliable uncertainty estimation in non-linear models, which are required to deal with the exploration–exploitation trade-off. Inspired by theories of...

Descripción completa

Detalles Bibliográficos
Autores principales: Elwood, Adam, Leonardi, Marco, Mohamed, Ashraf, Rozza, Alessandro
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955972/
https://www.ncbi.nlm.nih.gov/pubmed/36832555
http://dx.doi.org/10.3390/e25020188