Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning
Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learnin...
Main Authors: | Castellini, Jacopo, Oliehoek, Frans A., Savani, Rahul, Whiteson, Shimon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550438/ https://www.ncbi.nlm.nih.gov/pubmed/34720685 http://dx.doi.org/10.1007/s10458-021-09506-w |
_version_ | 1784590960386113536 |
---|---|
author | Castellini, Jacopo Oliehoek, Frans A. Savani, Rahul Whiteson, Shimon |
author_facet | Castellini, Jacopo Oliehoek, Frans A. Savani, Rahul Whiteson, Shimon |
author_sort | Castellini, Jacopo |
collection | PubMed |
description | Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the learning power of various network architectures on a series of one-shot games. Despite their simplicity, these games capture many of the crucial problems that arise in the multi-agent setting, such as an exponential number of joint actions or the lack of an explicit coordination mechanism. Our results extend those in Castellini et al. (Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS’19. International Foundation for Autonomous Agents and Multiagent Systems, pp 1862–1864, 2019) and quantify how well various approaches can represent the requisite value functions, and help us identify the reasons that can impede good performance, like sparsity of the values or too tight coordination requirements. |
format | Online Article Text |
id | pubmed-8550438 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8550438 2021-10-29 Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning Castellini, Jacopo Oliehoek, Frans A. Savani, Rahul Whiteson, Shimon Auton Agent Multi Agent Syst Article Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the learning power of various network architectures on a series of one-shot games. Despite their simplicity, these games capture many of the crucial problems that arise in the multi-agent setting, such as an exponential number of joint actions or the lack of an explicit coordination mechanism. Our results extend those in Castellini et al. (Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS’19. International Foundation for Autonomous Agents and Multiagent Systems, pp 1862–1864, 2019) and quantify how well various approaches can represent the requisite value functions, and help us identify the reasons that can impede good performance, like sparsity of the values or too tight coordination requirements. Springer US 2021-06-07 2021 /pmc/articles/PMC8550438/ /pubmed/34720685 http://dx.doi.org/10.1007/s10458-021-09506-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Castellini, Jacopo Oliehoek, Frans A. Savani, Rahul Whiteson, Shimon Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
title | Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
title_full | Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
title_fullStr | Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
title_full_unstemmed | Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
title_short | Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
title_sort | analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8550438/ https://www.ncbi.nlm.nih.gov/pubmed/34720685 http://dx.doi.org/10.1007/s10458-021-09506-w |
work_keys_str_mv | AT castellinijacopo analysingfactorizationsofactionvaluenetworksforcooperativemultiagentreinforcementlearning AT oliehoekfransa analysingfactorizationsofactionvaluenetworksforcooperativemultiagentreinforcementlearning AT savanirahul analysingfactorizationsofactionvaluenetworksforcooperativemultiagentreinforcementlearning AT whitesonshimon analysingfactorizationsofactionvaluenetworksforcooperativemultiagentreinforcementlearning |