Optimistic Value Iteration

Markov decision processes are widely used for planning and verification in settings that combine controllable or adversarial choices with probabilistic behaviour. The standard analysis algorithm, value iteration, only provides lower bounds on infinite-horizon probabilities and rewards. Two “sound” v...

Bibliographic Details
Main authors: Hartmanns, Arnd; Kaminski, Benjamin Lucien
Format: Online Article Text
Language: English
Published: 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7363440/
http://dx.doi.org/10.1007/978-3-030-53291-8_26