Adaptive Baseline Enhances EM-Based Policy Search: Validation in a View-Based Positioning Task of a Smartphone Balancer
EM-based policy search methods estimate a lower bound of the expected return from the histories of episodes and iteratively update the policy parameters by maximizing that lower bound, which makes gradient calculation and learning rate tuning unnecessary. Previous algorithms l...
Main authors: | Wang, Jiexin; Uchibe, Eiji; Doya, Kenji |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5256123/ https://www.ncbi.nlm.nih.gov/pubmed/28167910 http://dx.doi.org/10.3389/fnbot.2017.00001 |
Similar items
- Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task
  by: Kinjo, Ken, et al.
  Published: (2013)
- Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces
  by: Elfwing, Stefan, et al.
  Published: (2013)
- The effects on dynamic balance of dual-tasking using smartphone functions
  by: Hyong, In Hyouk
  Published: (2015)
- Cooperative and Competitive Reinforcement and Imitation Learning for a Mixture of Heterogeneous Learning Modules
  by: Uchibe, Eiji
  Published: (2018)
- Balanced difficulty task finder: an adaptive recommendation method for learning tasks based on the concept of state of flow
  by: Yazidi, Anis, et al.
  Published: (2020)