
Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation

This work studies online learning-based trajectory planning for multiple autonomous underwater vehicles (AUVs) to estimate a water parameter field of interest in the under-ice environment. A centralized system is considered, where several fixed access points on the ice layer are introduced as gatewa...

Bibliographic Details
Main Authors:	Wang, Chaofeng, Wei, Li, Wang, Zhaohui, Song, Min, Mahmoudian, Nina
Format:	Online Article Text
Language:	English
Published:	MDPI 2018
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6263807/ https://www.ncbi.nlm.nih.gov/pubmed/30424017 http://dx.doi.org/10.3390/s18113859

_version_	1783375362953052160
author	Wang, Chaofeng Wei, Li Wang, Zhaohui Song, Min Mahmoudian, Nina
author_facet	Wang, Chaofeng Wei, Li Wang, Zhaohui Song, Min Mahmoudian, Nina
author_sort	Wang, Chaofeng
collection	PubMed
description	This work studies online learning-based trajectory planning for multiple autonomous underwater vehicles (AUVs) to estimate a water parameter field of interest in the under-ice environment. A centralized system is considered, where several fixed access points on the ice layer are introduced as gateways for communications between the AUVs and a remote data fusion center. We model the water parameter field of interest as a Gaussian process with unknown hyper-parameters. The AUV trajectories for sampling are determined on an epoch-by-epoch basis. At the end of each epoch, the access points relay the observed field samples from all the AUVs to the fusion center, which computes the posterior distribution of the field based on the Gaussian process regression and estimates the field hyper-parameters. The optimal trajectories of all the AUVs in the next epoch are determined to maximize a long-term reward that is defined based on the field uncertainty reduction and the AUV mobility cost, subject to the kinematics constraint, the communication constraint and the sensing area constraint. We formulate the adaptive trajectory planning problem as a Markov decision process (MDP). A reinforcement learning-based online learning algorithm is designed to determine the optimal AUV trajectories in a constrained continuous space. Simulation results show that the proposed learning-based trajectory planning algorithm has performance similar to a benchmark method that assumes perfect knowledge of the field hyper-parameters.
format	Online Article Text
id	pubmed-6263807
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-62638072018-12-12 Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation Wang, Chaofeng Wei, Li Wang, Zhaohui Song, Min Mahmoudian, Nina Sensors (Basel) Article This work studies online learning-based trajectory planning for multiple autonomous underwater vehicles (AUVs) to estimate a water parameter field of interest in the under-ice environment. A centralized system is considered, where several fixed access points on the ice layer are introduced as gateways for communications between the AUVs and a remote data fusion center. We model the water parameter field of interest as a Gaussian process with unknown hyper-parameters. The AUV trajectories for sampling are determined on an epoch-by-epoch basis. At the end of each epoch, the access points relay the observed field samples from all the AUVs to the fusion center, which computes the posterior distribution of the field based on the Gaussian process regression and estimates the field hyper-parameters. The optimal trajectories of all the AUVs in the next epoch are determined to maximize a long-term reward that is defined based on the field uncertainty reduction and the AUV mobility cost, subject to the kinematics constraint, the communication constraint and the sensing area constraint. We formulate the adaptive trajectory planning problem as a Markov decision process (MDP). A reinforcement learning-based online learning algorithm is designed to determine the optimal AUV trajectories in a constrained continuous space. Simulation results show that the proposed learning-based trajectory planning algorithm has performance similar to a benchmark method that assumes perfect knowledge of the field hyper-parameters. MDPI 2018-11-09 /pmc/articles/PMC6263807/ /pubmed/30424017 http://dx.doi.org/10.3390/s18113859 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. 
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Wang, Chaofeng Wei, Li Wang, Zhaohui Song, Min Mahmoudian, Nina Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation
title	Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation
title_sort	reinforcement learning-based multi-auv adaptive trajectory planning for under-ice field estimation
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6263807/ https://www.ncbi.nlm.nih.gov/pubmed/30424017 http://dx.doi.org/10.3390/s18113859

Similar Items