
MW-MADDPG: a meta-learning based decision-making method for collaborative UAV swarm

Unmanned Aerial Vehicles (UAVs) have gained popularity due to their low lifecycle cost and minimal human risk, resulting in their widespread use in recent years. In the UAV swarm cooperative decision domain, multi-agent deep reinforcement learning has significant potential. However, current approach...


Bibliographic Details
Main authors:	Zhao, Minrui; Wang, Gang; Fu, Qiang; Guo, Xiangke; Chen, Yu; Li, Tengda; Liu, XiangYu
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2023
Subjects:	Neuroscience
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10551453/ https://www.ncbi.nlm.nih.gov/pubmed/37811355 http://dx.doi.org/10.3389/fnbot.2023.1243174

author	Zhao, Minrui; Wang, Gang; Fu, Qiang; Guo, Xiangke; Chen, Yu; Li, Tengda; Liu, XiangYu
author_sort	Zhao, Minrui
collection	PubMed
description	Unmanned Aerial Vehicles (UAVs) have gained popularity due to their low lifecycle cost and minimal human risk, resulting in their widespread use in recent years. In the UAV swarm cooperative decision domain, multi-agent deep reinforcement learning has significant potential. However, current approaches are challenged by the multivariate mission environment and mission time constraints. In light of this, the present study proposes a meta-learning based multi-agent deep reinforcement learning approach that provides a viable solution to this problem. This paper presents an improved MAML-based multi-agent deep deterministic policy gradient (MADDPG) algorithm that achieves an unbiased initialization network by automatically assigning weights to meta-learning trajectories. In addition, a Reward-TD prioritized experience replay technique is introduced, which takes into account immediate reward and TD-error to improve the resilience and sample utilization of the algorithm. Experiment results show that the proposed approach effectively accomplishes the task in the new scenario, with significantly improved task success rate, average reward, and robustness compared to existing methods.
format	Online Article Text
id	pubmed-10551453
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-10551453 2023-10-06 MW-MADDPG: a meta-learning based decision-making method for collaborative UAV swarm. Zhao, Minrui; Wang, Gang; Fu, Qiang; Guo, Xiangke; Chen, Yu; Li, Tengda; Liu, XiangYu. Front Neurorobot. Neuroscience. Frontiers Media S.A. 2023-09-21 /pmc/articles/PMC10551453/ /pubmed/37811355 http://dx.doi.org/10.3389/fnbot.2023.1243174 Text en. Copyright © 2023 Zhao, Wang, Fu, Guo, Chen, Li and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	MW-MADDPG: a meta-learning based decision-making method for collaborative UAV swarm
topic	Neuroscience
