A novel Q-learning algorithm based on improved whale optimization algorithm for path planning

Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow con...


Bibliographic Details
Main authors:	Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794100/ https://www.ncbi.nlm.nih.gov/pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438

_version_	1784859967249973248
author	Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu
collection	PubMed
description	Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow convergence speed. This paper proposes a new Q-learning algorithm called the Paired Whale Optimization Q-learning Algorithm (PWOQLA) which includes four improvements. Firstly, to accelerate the convergence speed of Q-learning, a whale optimization algorithm is used to initialize the values of a Q-table. Before the exploration process, a Q-table which contains previous experience is learned to improve algorithm efficiency. Secondly, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed in combination with a pairing strategy to speed up the search for prey. Thirdly, to improve the exploration efficiency of Q-learning and reduce the number of useless explorations, a new selective exploration strategy is introduced which considers the relationship between current position and target position. Fourthly, in order to balance the exploration and exploitation capabilities of Q-learning so that it focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed which changes the value of ε in ε-greedy Q-learning dynamically based on the number of iterations. Comparing the performance of PWOQLA with other path planning algorithms, experimental results demonstrate that PWOQLA achieves a higher level of accuracy and a faster convergence speed than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git.
format	Online Article Text
id	pubmed-9794100
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-9794100 2022-12-28 PLoS One Research Article Public Library of Science 2022-12-27 /pmc/articles/PMC9794100/ /pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438 Text en © 2022 Li et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	A novel Q-learning algorithm based on improved whale optimization algorithm for path planning
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794100/ https://www.ncbi.nlm.nih.gov/pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438
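
The description field above says that ε in ε-greedy Q-learning is changed by a nonlinear function of the iteration count, so that the agent explores early and exploits later. The record does not give that function, so the sketch below substitutes an assumed exponential decay to illustrate the idea on a small grid. The names `epsilon_schedule`, `step`, `train`, and `greedy_path`, the grid layout, the rewards, and all parameter values are hypothetical. The whale-optimization Q-table initialization and the selective exploration strategy from the paper are not modeled; the Q-table here starts at zero.

```python
import math
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
OBSTACLES = frozenset({(1, 1), (2, 3)})       # hypothetical obstacle cells


def epsilon_schedule(episode, total_episodes, eps_start=0.9, eps_end=0.05, steepness=5.0):
    """Assumed nonlinear (exponential) decay of epsilon; not the paper's function."""
    frac = episode / max(1, total_episodes - 1)
    return eps_end + (eps_start - eps_end) * math.exp(-steepness * frac)


def step(state, action, size, goal, obstacles):
    """Apply one move; blocked moves leave the agent in place with a penalty."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < size and 0 <= nc < size) or (nr, nc) in obstacles:
        return state, -1.0, False
    if (nr, nc) == goal:
        return (nr, nc), 10.0, True
    return (nr, nc), -0.1, False


def train(size=5, goal=(4, 4), obstacles=OBSTACLES, episodes=500,
          alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with an episode-dependent epsilon."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(size) for c in range(size)}
    for ep in range(episodes):
        eps = epsilon_schedule(ep, episodes)
        state = (0, 0)
        for _ in range(200):  # cap on steps per episode
            if rng.random() < eps:
                action = rng.randrange(4)  # explore
            else:
                action = max(range(4), key=lambda a: q[state][a])  # exploit
            nxt, reward, done = step(state, action, size, goal, obstacles)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, size=5, goal=(4, 4), obstacles=OBSTACLES, max_steps=50):
    """Follow the greedy policy from the start cell and record visited cells."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        if state == goal:
            break
        action = max(range(4), key=lambda a: q[state][a])
        state, _, _ = step(state, action, size, goal, obstacles)
        path.append(state)
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table))
```

With this schedule ε starts at `eps_start` and falls quickly toward `eps_end`, so most random moves happen in the first episodes. Any other monotonically decreasing nonlinear function could be swapped into `epsilon_schedule` without changing the rest of the loop.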
