An immediate-return reinforcement learning for the atypical Markov decision processes

The atypical Markov decision processes (MDPs) are decision-making for maximizing the immediate returns in only one state transition. Many complex dynamic problems can be regarded as the atypical MDPs, e.g., football trajectory control, approximations of the compound Poincaré maps, and parameter iden...

Bibliographic Details
Main Authors: Pan, Zebang; Wen, Guilin; Tan, Zhao; Yin, Shan; Hu, Xiaoyan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793950/
https://www.ncbi.nlm.nih.gov/pubmed/36582302
http://dx.doi.org/10.3389/fnbot.2022.1012427