Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation

This paper proposes a formal approach to online learning and planning for agents operating in a priori unknown, time-varying environments. The proposed method computes the maximally likely model of the environment, given the observations about the environment made by an agent earlier in the system r...

Bibliographic Details
Main Authors: Ornik, Melkior; Topcu, Ufuk
Format: Online Article Text
Language: English
Published: 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8739185/
https://www.ncbi.nlm.nih.gov/pubmed/35002545