Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation
This paper proposes a formal approach to online learning and planning for agents operating in a priori unknown, time-varying environments. The proposed method computes the maximally likely model of the environment, given the observations about the environment made by an agent earlier in the system r...
Main authors: | Ornik, Melkior; Topcu, Ufuk |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8739185/ https://www.ncbi.nlm.nih.gov/pubmed/35002545 |
Similar items
- Scenario-Based Verification of Uncertain MDPs
  by: Cubuktepe, Murat, et al.
  Published: (2020)
- Qualitative Controller Synthesis for Consumption Markov Decision Processes
  by: Blahoudek, František, et al.
  Published: (2020)
- Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning
  by: Hahn, Ernst Moritz, et al.
  Published: (2020)
- Simple Strategies in Multi-Objective MDPs
  by: Delgrange, Florent, et al.
  Published: (2020)
- Maximum Penalized Likelihood Estimation
  by: LaRiccia, Vincent N, et al.
  Published: (2009)