A novel Q-learning algorithm based on improved whale optimization algorithm for path planning
Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow con...
| Main Authors: | Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | Public Library of Science, 2022 |
| Subjects: | |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794100/ https://www.ncbi.nlm.nih.gov/pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438 |
Field | Value |
---|---|
_version_ | 1784859967249973248 |
author | Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu |
author_facet | Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu |
author_sort | Li, Ying |
collection | PubMed |
description | Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow convergence speed. This paper proposes a new Q-learning algorithm called the Paired Whale Optimization Q-learning Algorithm (PWOQLA) which includes four improvements. Firstly, to accelerate the convergence speed of Q-learning, a whale optimization algorithm is used to initialize the values of a Q-table. Before the exploration process, a Q-table which contains previous experience is learned to improve algorithm efficiency. Secondly, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed in combination with a pairing strategy to speed up the search for prey. Thirdly, to improve the exploration efficiency of Q-learning and reduce the number of useless explorations, a new selective exploration strategy is introduced which considers the relationship between current position and target position. Fourthly, in order to balance the exploration and exploitation capabilities of Q-learning so that it focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed which changes the value of ε in ε-greedy Q-learning dynamically based on the number of iterations. Comparing the performance of PWOQLA with other path planning algorithms, experimental results demonstrate that PWOQLA achieves a higher level of accuracy and a faster convergence speed than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git. |
format | Online Article Text |
id | pubmed-9794100 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9794100 2022-12-28 A novel Q-learning algorithm based on improved whale optimization algorithm for path planning Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu PLoS One Research Article Q-learning is a classical reinforcement learning algorithm and one of the most important methods of mobile robot path planning without a prior environmental model. Nevertheless, Q-learning is too simple when initializing Q-table and wastes too much time in the exploration process, causing a slow convergence speed. This paper proposes a new Q-learning algorithm called the Paired Whale Optimization Q-learning Algorithm (PWOQLA) which includes four improvements. Firstly, to accelerate the convergence speed of Q-learning, a whale optimization algorithm is used to initialize the values of a Q-table. Before the exploration process, a Q-table which contains previous experience is learned to improve algorithm efficiency. Secondly, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed in combination with a pairing strategy to speed up the search for prey. Thirdly, to improve the exploration efficiency of Q-learning and reduce the number of useless explorations, a new selective exploration strategy is introduced which considers the relationship between current position and target position. Fourthly, in order to balance the exploration and exploitation capabilities of Q-learning so that it focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed which changes the value of ε in ε-greedy Q-learning dynamically based on the number of iterations. Comparing the performance of PWOQLA with other path planning algorithms, experimental results demonstrate that PWOQLA achieves a higher level of accuracy and a faster convergence speed than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git. Public Library of Science 2022-12-27 /pmc/articles/PMC9794100/ /pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438 Text en © 2022 Li et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article; Li, Ying; Wang, Hanyu; Fan, Jiahao; Geng, Yanyu; A novel Q-learning algorithm based on improved whale optimization algorithm for path planning |
title | A novel Q-learning algorithm based on improved whale optimization algorithm for path planning |
title_full | A novel Q-learning algorithm based on improved whale optimization algorithm for path planning |
title_fullStr | A novel Q-learning algorithm based on improved whale optimization algorithm for path planning |
title_full_unstemmed | A novel Q-learning algorithm based on improved whale optimization algorithm for path planning |
title_short | A novel Q-learning algorithm based on improved whale optimization algorithm for path planning |
title_sort | novel q-learning algorithm based on improved whale optimization algorithm for path planning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9794100/ https://www.ncbi.nlm.nih.gov/pubmed/36574399 http://dx.doi.org/10.1371/journal.pone.0279438 |
work_keys_str_mv | AT liying anovelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT wanghanyu anovelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT fanjiahao anovelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT gengyanyu anovelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT liying novelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT wanghanyu novelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT fanjiahao novelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning AT gengyanyu novelqlearningalgorithmbasedonimprovedwhaleoptimizationalgorithmforpathplanning |
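
The abstract above describes, as its fourth improvement, a nonlinear function that shrinks the ε of ε-greedy Q-learning as training progresses, so the agent explores early and exploits late. The abstract does not give the exact function, so the sketch below is only a minimal illustration of how such a schedule plugs into tabular Q-learning, assuming a generic exponential decay; the parameter names (eps_start, eps_end, k), the toy grid size, and the decay shape are illustrative assumptions, not the authors' design. The released code at https://github.com/wanghanyu0526/improveQL.git is the authoritative reference.

```python
import numpy as np

def epsilon_schedule(iteration, max_iterations, eps_start=0.9, eps_end=0.05, k=5.0):
    """Nonlinear exploration schedule: large epsilon (more exploration) early,
    small epsilon (more exploitation) late. Exponential decay is an assumption;
    PWOQLA's actual function is defined in the paper, not in this sketch."""
    progress = iteration / max_iterations              # runs from 0 to 1
    return eps_end + (eps_start - eps_end) * np.exp(-k * progress)

def epsilon_greedy_action(q_table, state, epsilon, rng):
    """Standard epsilon-greedy action selection over a tabular Q-function."""
    n_actions = q_table.shape[1]
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))            # explore: random action
    return int(np.argmax(q_table[state]))              # exploit: greedy action

# Toy usage on a hypothetical 10x10 grid world with 4 moves per cell.
# PWOQLA would additionally seed q_table with values found by the paired
# whale optimization algorithm instead of zeros, and bias exploratory moves
# toward the goal position; neither is implemented in this sketch.
rng = np.random.default_rng(seed=0)
q_table = np.zeros((100, 4))
max_iterations = 500
for it in range(max_iterations):
    eps = epsilon_schedule(it, max_iterations)
    action = epsilon_greedy_action(q_table, state=0, epsilon=eps, rng=rng)
```

With the assumed defaults, ε falls from 0.9 toward roughly 0.05 over training, matching the explore-early, exploit-late behaviour the abstract describes.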