Value Iteration Networks with Double Estimator for Planetary Rover Path Planning

Path planning technology is significant for planetary rovers that perform exploration missions in unfamiliar environments. In this work, we propose a novel global path planning algorithm, based on the value iteration network (VIN), which is embedded within a differentiable planning module, built on...

Bibliographic Details
Main Authors:	Jin, Xiang; Lan, Wei; Wang, Tianlin; Yu, Pengyao
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8709000/ https://www.ncbi.nlm.nih.gov/pubmed/34960508 http://dx.doi.org/10.3390/s21248418

_version_	1784622825699540992
author	Jin, Xiang; Lan, Wei; Wang, Tianlin; Yu, Pengyao
author_facet	Jin, Xiang; Lan, Wei; Wang, Tianlin; Yu, Pengyao
author_sort	Jin, Xiang
collection	PubMed
description	Path planning technology is significant for planetary rovers that perform exploration missions in unfamiliar environments. In this work, we propose a novel global path planning algorithm, based on the value iteration network (VIN), which is embedded within a differentiable planning module, built on the value iteration (VI) algorithm, and has emerged as an effective method to learn to plan. Despite the capability of learning environment dynamics and performing long-range reasoning, the VIN suffers from several limitations, including sensitivity to initialization and poor performance in large-scale domains. We introduce the double value iteration network (dVIN), which decouples action selection and value estimation in the VI module, using the weighted double estimator method to approximate the maximum expected value, instead of maximizing over the estimated action value. We have devised a simple, yet effective, two-stage training strategy for VI-based models to address the problem of high computational cost and poor performance in large-size domains. We evaluate the dVIN on planning problems in grid-world domains and realistic datasets, generated from terrain images of a moon landscape. We show that our dVIN empirically outperforms the baseline methods and generalizes better to large-scale environments.
format	Online Article Text
id	pubmed-8709000
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8709000 2021-12-25 Value Iteration Networks with Double Estimator for Planetary Rover Path Planning Jin, Xiang; Lan, Wei; Wang, Tianlin; Yu, Pengyao Sensors (Basel) Article Path planning technology is significant for planetary rovers that perform exploration missions in unfamiliar environments. In this work, we propose a novel global path planning algorithm, based on the value iteration network (VIN), which is embedded within a differentiable planning module, built on the value iteration (VI) algorithm, and has emerged as an effective method to learn to plan. Despite the capability of learning environment dynamics and performing long-range reasoning, the VIN suffers from several limitations, including sensitivity to initialization and poor performance in large-scale domains. We introduce the double value iteration network (dVIN), which decouples action selection and value estimation in the VI module, using the weighted double estimator method to approximate the maximum expected value, instead of maximizing over the estimated action value. We have devised a simple, yet effective, two-stage training strategy for VI-based models to address the problem of high computational cost and poor performance in large-size domains. We evaluate the dVIN on planning problems in grid-world domains and realistic datasets, generated from terrain images of a moon landscape. We show that our dVIN empirically outperforms the baseline methods and generalizes better to large-scale environments. MDPI 2021-12-16 /pmc/articles/PMC8709000/ /pubmed/34960508 http://dx.doi.org/10.3390/s21248418 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Jin, Xiang Lan, Wei Wang, Tianlin Yu, Pengyao Value Iteration Networks with Double Estimator for Planetary Rover Path Planning
title	Value Iteration Networks with Double Estimator for Planetary Rover Path Planning
title_full	Value Iteration Networks with Double Estimator for Planetary Rover Path Planning
title_fullStr	Value Iteration Networks with Double Estimator for Planetary Rover Path Planning
title_full_unstemmed	Value Iteration Networks with Double Estimator for Planetary Rover Path Planning
title_short	Value Iteration Networks with Double Estimator for Planetary Rover Path Planning
title_sort	value iteration networks with double estimator for planetary rover path planning
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8709000/ https://www.ncbi.nlm.nih.gov/pubmed/34960508 http://dx.doi.org/10.3390/s21248418
work_keys_str_mv	AT jinxiang valueiterationnetworkswithdoubleestimatorforplanetaryroverpathplanning AT lanwei valueiterationnetworkswithdoubleestimatorforplanetaryroverpathplanning AT wangtianlin valueiterationnetworkswithdoubleestimatorforplanetaryroverpathplanning AT yupengyao valueiterationnetworkswithdoubleestimatorforplanetaryroverpathplanning

Similar Items