
A novel Q-learning algorithm based on improved whale optimization algorithm for path planning

Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow con...


Bibliographic Details
Main Authors: Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794100/
https://www.ncbi.nlm.nih.gov/pubmed/36574399
http://dx.doi.org/10.1371/journal.pone.0279438
author Li, Ying
Wang, Hanyu
Fan, Jiahao
Geng, Yanyu
collection PubMed
description Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow convergence speed. This paper proposes a new Q-learning algorithm called the Paired Whale Optimization Q-learning Algorithm (PWOQLA) which includes four improvements. Firstly, to accelerate the convergence speed of Q-learning, a whale optimization algorithm is used to initialize the values of a Q-table. Before the exploration process, a Q-table which contains previous experience is learned to improve algorithm efficiency. Secondly, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed in combination with a pairing strategy to speed up the search for prey. Thirdly, to improve the exploration efficiency of Q-learning and reduce the number of useless explorations, a new selective exploration strategy is introduced which considers the relationship between current position and target position. Fourthly, in order to balance the exploration and exploitation capabilities of Q-learning so that it focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed which changes the value of ε in ε-greedy Q-learning dynamically based on the number of iterations. Comparing the performance of PWOQLA with other path planning algorithms, experimental results demonstrate that PWOQLA achieves a higher level of accuracy and a faster convergence speed than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git.
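The fourth improvement in the description, an ε that shrinks nonlinearly with the iteration count, can be illustrated with a minimal sketch. The abstract does not give the actual function, so the exponential form below, the name `epsilon_schedule`, and the parameters `eps_start`, `eps_end`, and `sharpness` are all assumptions chosen for illustration; they are not taken from the paper. The Q-table update is the standard tabular Q-learning rule, and the whale-optimization initialization and selective exploration strategy are not shown.

```python
import math
import random


def epsilon_schedule(iteration, max_iterations,
                     eps_start=0.9, eps_end=0.05, sharpness=5.0):
    """Assumed nonlinear decay (not the paper's function).

    Starts at eps_start and decays exponentially toward eps_end as the
    fraction of completed iterations grows, so early iterations mostly
    explore and late iterations mostly exploit.
    """
    progress = iteration / max_iterations
    return eps_end + (eps_start - eps_end) * math.exp(-sharpness * progress)


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy selection over one row of the Q-table."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_row))
    # Exploit: pick the action with the largest Q-value.
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = [[0.0, 0.0], [1.0, 2.0]]
    for it in (0, 50, 100):
        eps = epsilon_schedule(it, 100)
        print(it, round(eps, 3), choose_action(q_table[1], eps, rng))
```

Any monotonically decreasing nonlinear curve would serve the same purpose; the exponential was picked here only because it is short to write down.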
format Online
Article
Text
id pubmed-9794100
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9794100 2022-12-28 PLoS One Research Article Public Library of Science 2022-12-27 /pmc/articles/PMC9794100/ /pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438 Text en © 2022 Li et al https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A novel Q-learning algorithm based on improved whale optimization algorithm for path planning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794100/
https://www.ncbi.nlm.nih.gov/pubmed/36574399
http://dx.doi.org/10.1371/journal.pone.0279438