
A hierarchical reinforcement learning method for missile evasion and guidance


Bibliographic Details
Main Authors: Yan, Mengda, Yang, Rennong, Zhang, Ying, Yue, Longfei, Hu, Dongyuan
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9640633/
https://www.ncbi.nlm.nih.gov/pubmed/36344598
http://dx.doi.org/10.1038/s41598-022-21756-6
_version_ 1784825899526389760
author Yan, Mengda
Yang, Rennong
Zhang, Ying
Yue, Longfei
Hu, Dongyuan
author_facet Yan, Mengda
Yang, Rennong
Zhang, Ying
Yue, Longfei
Hu, Dongyuan
author_sort Yan, Mengda
collection PubMed
description This paper proposes a missile manoeuvring algorithm based on hierarchical proximal policy optimization (PPO) reinforcement learning, which enables a missile to guide itself to a target while evading an interceptor. Following the idea of task hierarchy, the agent has a two-layer structure, in which low-level agents control basic actions and are supervised by a high-level agent. The low level has two agents, a guidance agent and an evasion agent, which are trained in simple scenarios and embedded in the high-level agent. The high level has a policy selector agent, which activates one of the low-level agents at each decision moment. Each agent has a different reward function, accounting for guidance accuracy, flight time, and energy consumption, as well as a field-of-view constraint. Simulation shows that PPO without a hierarchical structure cannot complete the task, whereas the hierarchical PPO algorithm achieves a 100% success rate on a test dataset. The agent shows good adaptability and strong robustness to second-order autopilot lag and measurement noise. Compared with a traditional guidance law, the reinforcement learning guidance law achieves satisfactory guidance accuracy and significant advantages in average time and average energy consumption.
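
The two-layer structure summarized above (two PPO-trained low-level agents plus a high-level policy selector that activates one of them at each decision moment) can be illustrated with a minimal sketch. The following Python skeleton is an assumption-laden illustration, not the authors' implementation: the class names, observation fields, and hand-written selection threshold are hypothetical stand-ins for the trained networks described in the paper.

# Minimal sketch of the two-layer control loop described in the abstract.
# All names and thresholds here are illustrative assumptions; in the paper
# the low-level policies and the selector are PPO-trained networks, not
# the placeholder rules below.

class GuidancePolicy:
    """Stand-in for the PPO-trained low-level guidance agent."""
    def act(self, obs):
        # Steer toward the target (proportional-navigation-like placeholder).
        return 0.5 * obs["target_bearing"]

class EvasionPolicy:
    """Stand-in for the PPO-trained low-level evasion agent."""
    def act(self, obs):
        # Turn away from the interceptor's line of sight.
        return -1.0 if obs["interceptor_bearing"] > 0 else 1.0

class PolicySelector:
    """Stand-in for the high-level agent: at each decision moment it
    activates exactly one of the embedded low-level agents."""
    def __init__(self, guidance, evasion):
        self.options = [guidance, evasion]
    def select(self, obs):
        # A trained selector would map the full state to an option index;
        # this fixed range threshold is purely for illustration.
        return self.options[0] if obs["interceptor_range"] > 5000.0 else self.options[1]

def episode_step(selector, obs):
    option = selector.select(obs)  # high level: choose a sub-policy
    return option.act(obs)         # low level: produce the basic action

if __name__ == "__main__":
    selector = PolicySelector(GuidancePolicy(), EvasionPolicy())
    obs = {"target_bearing": 0.2, "interceptor_bearing": -0.1,
           "interceptor_range": 3000.0}
    print(episode_step(selector, obs))  # evasion policy is active here

In this sketch the high level never outputs control commands itself; it only decides which low-level agent acts, which is the division of labour the abstract describes.
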
format Online
Article
Text
id pubmed-9640633
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9640633 2022-11-15 A hierarchical reinforcement learning method for missile evasion and guidance Yan, Mengda Yang, Rennong Zhang, Ying Yue, Longfei Hu, Dongyuan Sci Rep Article This paper proposes a missile manoeuvring algorithm based on hierarchical proximal policy optimization (PPO) reinforcement learning, which enables a missile to guide itself to a target while evading an interceptor. Following the idea of task hierarchy, the agent has a two-layer structure, in which low-level agents control basic actions and are supervised by a high-level agent. The low level has two agents, a guidance agent and an evasion agent, which are trained in simple scenarios and embedded in the high-level agent. The high level has a policy selector agent, which activates one of the low-level agents at each decision moment. Each agent has a different reward function, accounting for guidance accuracy, flight time, and energy consumption, as well as a field-of-view constraint. Simulation shows that PPO without a hierarchical structure cannot complete the task, whereas the hierarchical PPO algorithm achieves a 100% success rate on a test dataset. The agent shows good adaptability and strong robustness to second-order autopilot lag and measurement noise. Compared with a traditional guidance law, the reinforcement learning guidance law achieves satisfactory guidance accuracy and significant advantages in average time and average energy consumption. Nature Publishing Group UK 2022-11-07 /pmc/articles/PMC9640633/ /pubmed/36344598 http://dx.doi.org/10.1038/s41598-022-21756-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Yan, Mengda
Yang, Rennong
Zhang, Ying
Yue, Longfei
Hu, Dongyuan
A hierarchical reinforcement learning method for missile evasion and guidance
title A hierarchical reinforcement learning method for missile evasion and guidance
title_full A hierarchical reinforcement learning method for missile evasion and guidance
title_fullStr A hierarchical reinforcement learning method for missile evasion and guidance
title_full_unstemmed A hierarchical reinforcement learning method for missile evasion and guidance
title_short A hierarchical reinforcement learning method for missile evasion and guidance
title_sort hierarchical reinforcement learning method for missile evasion and guidance
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9640633/
https://www.ncbi.nlm.nih.gov/pubmed/36344598
http://dx.doi.org/10.1038/s41598-022-21756-6
work_keys_str_mv AT yanmengda ahierarchicalreinforcementlearningmethodformissileevasionandguidance
AT yangrennong ahierarchicalreinforcementlearningmethodformissileevasionandguidance
AT zhangying ahierarchicalreinforcementlearningmethodformissileevasionandguidance
AT yuelongfei ahierarchicalreinforcementlearningmethodformissileevasionandguidance
AT hudongyuan ahierarchicalreinforcementlearningmethodformissileevasionandguidance
AT yanmengda hierarchicalreinforcementlearningmethodformissileevasionandguidance
AT yangrennong hierarchicalreinforcementlearningmethodformissileevasionandguidance
AT zhangying hierarchicalreinforcementlearningmethodformissileevasionandguidance
AT yuelongfei hierarchicalreinforcementlearningmethodformissileevasionandguidance
AT hudongyuan hierarchicalreinforcementlearningmethodformissileevasionandguidance