Safe reinforcement learning under temporal logic with reward design and quantum action selection

This paper proposes an advanced Reinforcement Learning (RL) method, incorporating reward-shaping, safety value functions, and a quantum action selection algorithm. The method is model-free and can synthesize a finite policy that maximizes the probability of satisfying a complex task. Although RL is...

Bibliographic Details
Main authors: Cai, Mingyu; Xiao, Shaoping; Li, Junchao; Kan, Zhen
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9894922/
https://www.ncbi.nlm.nih.gov/pubmed/36732441
http://dx.doi.org/10.1038/s41598-023-28582-4