
Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle

In order to solve the problem that the existing reinforcement learning algorithm is difficult to converge due to the excessive state space of the three-dimensional path planning of the unmanned aerial vehicle, this article proposes a reinforcement learning algorithm based on the heuristic function a...


Bibliographic Details
Main Authors:	Xie, Ronglei; Meng, Zhijun; Zhou, Yaoming; Ma, Yunpeng; Wu, Zhe
Format:	Online Article Text
Language:	English
Published:	SAGE Publications 2019
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453672/ https://www.ncbi.nlm.nih.gov/pubmed/31829875 http://dx.doi.org/10.1177/0036850419879024

_version_	1785095994484981760
author	Xie, Ronglei Meng, Zhijun Zhou, Yaoming Ma, Yunpeng Wu, Zhe
author_facet	Xie, Ronglei Meng, Zhijun Zhou, Yaoming Ma, Yunpeng Wu, Zhe
author_sort	Xie, Ronglei
collection	PubMed
description	In order to solve the problem that the existing reinforcement learning algorithm is difficult to converge due to the excessive state space of the three-dimensional path planning of the unmanned aerial vehicle, this article proposes a reinforcement learning algorithm based on the heuristic function and the maximum average reward value of the experience replay mechanism. The knowledge of track performance is introduced to construct heuristic function to guide the unmanned aerial vehicles’ action selection and reduce the useless exploration. Experience replay mechanism based on maximum average reward increases the utilization rate of excellent samples and the convergence speed of the algorithm. The simulation results show that the proposed three-dimensional path planning algorithm has good learning efficiency, and the convergence speed and training performance are significantly improved.
format	Online Article Text
id	pubmed-10453672
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	SAGE Publications
record_format	MEDLINE/PubMed
spelling	pubmed-10453672 2023-08-26 Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle Xie, Ronglei Meng, Zhijun Zhou, Yaoming Ma, Yunpeng Wu, Zhe Sci Prog Article In order to solve the problem that the existing reinforcement learning algorithm is difficult to converge due to the excessive state space of the three-dimensional path planning of the unmanned aerial vehicle, this article proposes a reinforcement learning algorithm based on the heuristic function and the maximum average reward value of the experience replay mechanism. The knowledge of track performance is introduced to construct heuristic function to guide the unmanned aerial vehicles’ action selection and reduce the useless exploration. Experience replay mechanism based on maximum average reward increases the utilization rate of excellent samples and the convergence speed of the algorithm. The simulation results show that the proposed three-dimensional path planning algorithm has good learning efficiency, and the convergence speed and training performance are significantly improved. SAGE Publications 2019-09-30 /pmc/articles/PMC10453672/ /pubmed/31829875 http://dx.doi.org/10.1177/0036850419879024 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle	Article Xie, Ronglei Meng, Zhijun Zhou, Yaoming Ma, Yunpeng Wu, Zhe Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
title	Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
title_full	Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
title_fullStr	Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
title_full_unstemmed	Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
title_short	Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
title_sort	heuristic q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453672/ https://www.ncbi.nlm.nih.gov/pubmed/31829875 http://dx.doi.org/10.1177/0036850419879024
work_keys_str_mv	AT xieronglei heuristicqlearningbasedonexperiencereplayforthreedimensionalpathplanningoftheunmannedaerialvehicle AT mengzhijun heuristicqlearningbasedonexperiencereplayforthreedimensionalpathplanningoftheunmannedaerialvehicle AT zhouyaoming heuristicqlearningbasedonexperiencereplayforthreedimensionalpathplanningoftheunmannedaerialvehicle AT mayunpeng heuristicqlearningbasedonexperiencereplayforthreedimensionalpathplanningoftheunmannedaerialvehicle AT wuzhe heuristicqlearningbasedonexperiencereplayforthreedimensionalpathplanningoftheunmannedaerialvehicle
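
The description field above names two ideas: a heuristic function that guides action selection, and an experience replay mechanism keyed on the maximum average reward. The following is only a minimal sketch of how those two ideas might fit together in tabular Q-learning; it is not the authors' implementation. The grid size, goal, reward values, hyperparameters, the Manhattan-distance heuristic, and the rule "replay the stored episode with the highest average reward" are all assumptions made for illustration, since the record does not specify them.

```python
import random

# Hypothetical setup (not from the record): a small 3D grid, six axis-aligned moves.
SIZE = 5
GOAL = (4, 4, 4)
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
ALPHA, GAMMA, EPSILON, XI = 0.5, 0.9, 0.1, 1.0  # assumed hyperparameters


def step(state, action):
    """Move one cell, clamping to the grid. Reward values are assumed."""
    nxt = tuple(min(SIZE - 1, max(0, s + d)) for s, d in zip(state, action))
    reward = 10.0 if nxt == GOAL else -1.0
    return nxt, reward, nxt == GOAL


def heuristic(state, action):
    """Assumed heuristic: negative Manhattan distance to the goal after the move."""
    nxt, _, _ = step(state, action)
    return -sum(abs(g - n) for g, n in zip(GOAL, nxt))


def choose_action(q, state, rng):
    """Epsilon-greedy selection over Q(s, a) + XI * H(s, a)."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    return max(
        range(len(ACTIONS)),
        key=lambda a: q.get((state, a), 0.0) + XI * heuristic(state, ACTIONS[a]),
    )


def q_update(q, s, a, r, s2, done):
    """Standard one-step Q-learning update on a dict-backed table."""
    best_next = 0.0 if done else max(q.get((s2, b), 0.0) for b in range(len(ACTIONS)))
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)


def train(episodes=50, max_steps=100, seed=0):
    """Run episodes, keeping and replaying the one with the highest average reward."""
    rng = random.Random(seed)
    q, best_avg, best_episode = {}, float("-inf"), []
    for _ in range(episodes):
        state, trajectory = (0, 0, 0), []
        for _ in range(max_steps):
            a = choose_action(q, state, rng)
            nxt, r, done = step(state, ACTIONS[a])
            q_update(q, state, a, r, nxt, done)
            trajectory.append((state, a, r, nxt, done))
            state = nxt
            if done:
                break
        avg = sum(t[2] for t in trajectory) / len(trajectory)
        if avg > best_avg:  # assumed rule: remember the best episode so far
            best_avg, best_episode = avg, trajectory
        for s, a, r, s2, done in reversed(best_episode):  # replay it backwards
            q_update(q, s, a, r, s2, done)
    return q, best_avg


if __name__ == "__main__":
    table, best = train()
    print(len(table), "state-action entries; best average reward:", best)
```

In this sketch the heuristic only biases which action is tried; the Q-table itself is updated by the ordinary Q-learning rule, so the heuristic weight XI can be changed without altering the update.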

