An immediate-return reinforcement learning for the atypical Markov decision processes
Atypical Markov decision processes (MDPs) are decision-making problems in which the goal is to maximize the immediate return over a single state transition. Many complex dynamic problems can be regarded as atypical MDPs, e.g., football trajectory control, approximations of the compound Poincaré maps, and parameter iden...
Main authors: | , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793950/ https://www.ncbi.nlm.nih.gov/pubmed/36582302 http://dx.doi.org/10.3389/fnbot.2022.1012427 |