Entropic Regularization of Markov Decision Processes
An optimal feedback controller for a given Markov decision process (MDP) can in principle be synthesized by value or policy iteration. However, if the system dynamics and the reward function are unknown, a learning agent must discover an optimal controller via direct interaction with the environment...
Main authors: | Belousov, Boris; Peters, Jan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7515171/ https://www.ncbi.nlm.nih.gov/pubmed/33267388 http://dx.doi.org/10.3390/e21070674 |
Similar items
- Soft Quantization Using Entropic Regularization
  by: Lakshmanan, Rajmadan, et al.
  Published: (2023)
- Mixture of Experts with Entropic Regularization for Data Classification
  by: Peralta, Billy, et al.
  Published: (2019)
- Adversarially Robust Learning via Entropic Regularization
  by: Jagatap, Gauri, et al.
  Published: (2022)
- Markov decision processes with their applications
  by: Hu, Qiying
  Published: (2008)
- Competitive Markov decision processes
  by: Filar, Jerzy A., 1949-
  Published: (1997)