Adaptive Baseline Enhances EM-Based Policy Search: Validation in a View-Based Positioning Task of a Smartphone Balancer

EM-based policy search methods estimate a lower bound of the expected return from the histories of episodes and iteratively update the policy parameters using the maximum of a lower bound of expected return, which makes gradient calculation and learning rate tuning unnecessary. Previous algorithms l...

Bibliographic Details
Main Authors:	Wang, Jiexin; Uchibe, Eiji; Doya, Kenji
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2017
Subjects:	Neuroscience
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5256123/ https://www.ncbi.nlm.nih.gov/pubmed/28167910 http://dx.doi.org/10.3389/fnbot.2017.00001
