Learning Reward Function with Matching Network for Mapless Navigation

Deep reinforcement learning (DRL) has been successfully applied in mapless navigation. An important issue in DRL is to design a reward function for evaluating actions of agents. However, designing a robust and suitable reward function greatly depends on the designer’s experience and intuition. To ad...

Bibliographic Details
Main Authors:	Zhang, Qichen; Zhu, Meiqiang; Zou, Liang; Li, Ming; Zhang, Yong
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7374413/ https://www.ncbi.nlm.nih.gov/pubmed/32629934 http://dx.doi.org/10.3390/s20133664

_version_	1783561693487431680
author	Zhang, Qichen Zhu, Meiqiang Zou, Liang Li, Ming Zhang, Yong
author_sort	Zhang, Qichen
collection	PubMed
description	Deep reinforcement learning (DRL) has been successfully applied to mapless navigation. An important issue in DRL is designing a reward function that evaluates the agent's actions. However, designing a robust and suitable reward function depends heavily on the designer’s experience and intuition. To address this concern, we consider reward shaping from trajectories on similar navigation tasks without human supervision, and propose a general reward function based on a matching network (MN). The MN-based reward function gains experience by pre-training on trajectories from different navigation tasks and accelerates DRL training on new tasks. The proposed reward function keeps the optimal strategy of DRL unchanged. Simulation results on two static maps show that DRL converges in fewer iterations with the learned reward function than with state-of-the-art mapless navigation methods. The proposed method also performs well in dynamic maps with partially moving obstacles. Even when the test maps differ from the training maps, the proposed strategy is able to complete the navigation tasks without additional training.
format	Online Article Text
id	pubmed-7374413
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7374413 2020-08-06 Learning Reward Function with Matching Network for Mapless Navigation Zhang, Qichen Zhu, Meiqiang Zou, Liang Li, Ming Zhang, Yong Sensors (Basel) Article MDPI 2020-06-30 /pmc/articles/PMC7374413/ /pubmed/32629934 http://dx.doi.org/10.3390/s20133664 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Zhang, Qichen Zhu, Meiqiang Zou, Liang Li, Ming Zhang, Yong Learning Reward Function with Matching Network for Mapless Navigation
title	Learning Reward Function with Matching Network for Mapless Navigation
title_sort	learning reward function with matching network for mapless navigation
topic	Article

Similar Items