Learning Macromanagement in Starcraft by Deep Reinforcement Learning
StarCraft is a real-time strategy game that provides a complex environment for AI research. Macromanagement, i.e., selecting appropriate units to build depending on the current state, is one of the most important problems in this game. To reduce the requirements for expert knowledge and enhance the coordination of the systematic bot, we select reinforcement learning (RL) to tackle the problem of macromanagement. We propose a novel deep RL method, Mean Asynchronous Advantage Actor-Critic (MA3C), which computes the approximate expected policy gradient instead of the gradient of a sampled action to reduce the variance of the gradient, and encodes the history queue with a recurrent neural network to tackle the problem of imperfect information. The experimental results show that MA3C achieves a very high win rate, approximately 90%, against the weaker opponents, and it improves the win rate by about 30% against the stronger opponents. We also propose a novel method to visualize and interpret the policy learned by MA3C. Combined with the visualized results and the snapshots of games, we find that the learned macromanagement not only adapts to the game rules and the policy of the opponent bot, but also cooperates well with the other modules of MA3C-Bot.
Main Authors: | Huang, Wenzhen, Yin, Qiyue, Zhang, Junge, Huang, Kaiqi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8150573/ https://www.ncbi.nlm.nih.gov/pubmed/34065012 http://dx.doi.org/10.3390/s21103332 |
_version_ | 1783698180395761664 |
---|---|
author | Huang, Wenzhen Yin, Qiyue Zhang, Junge Huang, Kaiqi |
author_facet | Huang, Wenzhen Yin, Qiyue Zhang, Junge Huang, Kaiqi |
author_sort | Huang, Wenzhen |
collection | PubMed |
description | StarCraft is a real-time strategy game that provides a complex environment for AI research. Macromanagement, i.e., selecting appropriate units to build depending on the current state, is one of the most important problems in this game. To reduce the requirements for expert knowledge and enhance the coordination of the systematic bot, we select reinforcement learning (RL) to tackle the problem of macromanagement. We propose a novel deep RL method, Mean Asynchronous Advantage Actor-Critic (MA3C), which computes the approximate expected policy gradient instead of the gradient of a sampled action to reduce the variance of the gradient, and encodes the history queue with a recurrent neural network to tackle the problem of imperfect information. The experimental results show that MA3C achieves a very high win rate, approximately 90%, against the weaker opponents, and it improves the win rate by about 30% against the stronger opponents. We also propose a novel method to visualize and interpret the policy learned by MA3C. Combined with the visualized results and the snapshots of games, we find that the learned macromanagement not only adapts to the game rules and the policy of the opponent bot, but also cooperates well with the other modules of MA3C-Bot. |
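The record does not include the paper's code, but the core variance-reduction idea the abstract describes, computing the expected policy gradient over all actions instead of the score-function gradient of one sampled action, can be illustrated in a minimal sketch. Everything below is hypothetical (the logits `theta`, the advantage values, and the four-action toy state are invented for illustration); it shows the general "mean" estimator idea, not MA3C's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setting: one state, 4 discrete build actions,
# a softmax policy over logits `theta`, and known advantages A(s, a).
theta = np.array([0.2, -0.1, 0.3, 0.0])
advantages = np.array([1.0, -0.5, 2.0, 0.1])

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def sampled_grad(theta, advantages, rng):
    """Classic actor-critic estimator: score function of ONE sampled action.

    Unbiased, but its value depends on which action happened to be drawn,
    which is the source of the variance MA3C is said to reduce.
    """
    pi = softmax(theta)
    a = rng.choice(len(pi), p=pi)
    # For a softmax policy, d/dtheta log pi(a) = onehot(a) - pi.
    return (np.eye(len(pi))[a] - pi) * advantages[a]

def expected_grad(theta, advantages):
    """Expected ("mean") policy gradient: weighted average over ALL actions.

    sum_a pi(a) * grad(log pi(a)) * A(s, a) -- no sampling over the action,
    so the action-choice variance of the estimate is zero.
    """
    pi = softmax(theta)
    score = np.eye(len(pi)) - pi              # row a: grad of log pi(a)
    per_action = score * advantages[:, None]  # row a: grad(log pi(a)) * A(a)
    return (pi[:, None] * per_action).sum(axis=0)

# Empirically: the mean of many sampled estimates matches the expected
# gradient, while individual samples scatter around it.
samples = np.stack([sampled_grad(theta, advantages, rng) for _ in range(20000)])
empirical_mean = samples.mean(axis=0)
exact = expected_grad(theta, advantages)
```

The two estimators have the same expectation; the difference is purely in variance, which is why averaging analytically over the action distribution (as in mean actor-critic style methods) can stabilize training at the cost of evaluating the advantage for every action.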
format | Online Article Text |
id | pubmed-8150573 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8150573 2021-05-27 Learning Macromanagement in Starcraft by Deep Reinforcement Learning Huang, Wenzhen Yin, Qiyue Zhang, Junge Huang, Kaiqi Sensors (Basel) Article StarCraft is a real-time strategy game that provides a complex environment for AI research. Macromanagement, i.e., selecting appropriate units to build depending on the current state, is one of the most important problems in this game. To reduce the requirements for expert knowledge and enhance the coordination of the systematic bot, we select reinforcement learning (RL) to tackle the problem of macromanagement. We propose a novel deep RL method, Mean Asynchronous Advantage Actor-Critic (MA3C), which computes the approximate expected policy gradient instead of the gradient of a sampled action to reduce the variance of the gradient, and encodes the history queue with a recurrent neural network to tackle the problem of imperfect information. The experimental results show that MA3C achieves a very high win rate, approximately 90%, against the weaker opponents, and it improves the win rate by about 30% against the stronger opponents. We also propose a novel method to visualize and interpret the policy learned by MA3C. Combined with the visualized results and the snapshots of games, we find that the learned macromanagement not only adapts to the game rules and the policy of the opponent bot, but also cooperates well with the other modules of MA3C-Bot. MDPI 2021-05-11 /pmc/articles/PMC8150573/ /pubmed/34065012 http://dx.doi.org/10.3390/s21103332 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Huang, Wenzhen Yin, Qiyue Zhang, Junge Huang, Kaiqi Learning Macromanagement in Starcraft by Deep Reinforcement Learning |
title | Learning Macromanagement in Starcraft by Deep Reinforcement Learning |
title_full | Learning Macromanagement in Starcraft by Deep Reinforcement Learning |
title_fullStr | Learning Macromanagement in Starcraft by Deep Reinforcement Learning |
title_full_unstemmed | Learning Macromanagement in Starcraft by Deep Reinforcement Learning |
title_short | Learning Macromanagement in Starcraft by Deep Reinforcement Learning |
title_sort | learning macromanagement in starcraft by deep reinforcement learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8150573/ https://www.ncbi.nlm.nih.gov/pubmed/34065012 http://dx.doi.org/10.3390/s21103332 |
work_keys_str_mv | AT huangwenzhen learningmacromanagementinstarcraftbydeepreinforcementlearning AT yinqiyue learningmacromanagementinstarcraftbydeepreinforcementlearning AT zhangjunge learningmacromanagementinstarcraftbydeepreinforcementlearning AT huangkaiqi learningmacromanagementinstarcraftbydeepreinforcementlearning |