
Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics


Bibliographic Details
Main author:	Barfuss, Wolfram
Format:	Online Article Text
Language:	English
Published:	Springer London 2021
Subjects:	S.I. : Adaptive and Learning Agents 2020
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8827307/ https://www.ncbi.nlm.nih.gov/pubmed/35221541 http://dx.doi.org/10.1007/s00521-021-06117-0

_version_	1784647602300518400
author	Barfuss, Wolfram
author_facet	Barfuss, Wolfram
author_sort	Barfuss, Wolfram
collection	PubMed
description	A dynamical systems perspective on multi-agent learning, based on the link between evolutionary game theory and reinforcement learning, provides an improved, qualitative understanding of the emerging collective learning dynamics. However, confusion exists with respect to how this dynamical systems account of multi-agent learning should be interpreted. In this article, I propose to embed the dynamical systems description of multi-agent learning into different abstraction levels of cognitive analysis. The purpose of this work is to make the connections between these levels explicit in order to gain improved insight into multi-agent learning. I demonstrate the usefulness of this framework with the general and widespread class of temporal-difference reinforcement learning. I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty. I then propose an on-line sample-batch temporal-difference algorithm which is characterized by the combination of applying a memory-batch and separated state-action value estimation. I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach the ones of the deterministic learning equations under large batch sizes. Ultimately, this framework of embedding a dynamical systems description into different abstraction levels gives guidance on how to unleash the full potential of the dynamical systems approach to multi-agent learning.
format	Online Article Text
id	pubmed-8827307
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Springer London
record_format	MEDLINE/PubMed
spelling	pubmed-8827307 2022-02-23 Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics Barfuss, Wolfram Neural Comput Appl S.I. : Adaptive and Learning Agents 2020 A dynamical systems perspective on multi-agent learning, based on the link between evolutionary game theory and reinforcement learning, provides an improved, qualitative understanding of the emerging collective learning dynamics. However, confusion exists with respect to how this dynamical systems account of multi-agent learning should be interpreted. In this article, I propose to embed the dynamical systems description of multi-agent learning into different abstraction levels of cognitive analysis. The purpose of this work is to make the connections between these levels explicit in order to gain improved insight into multi-agent learning. I demonstrate the usefulness of this framework with the general and widespread class of temporal-difference reinforcement learning. I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty. I then propose an on-line sample-batch temporal-difference algorithm which is characterized by the combination of applying a memory-batch and separated state-action value estimation. I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach the ones of the deterministic learning equations under large batch sizes. Ultimately, this framework of embedding a dynamical systems description into different abstraction levels gives guidance on how to unleash the full potential of the dynamical systems approach to multi-agent learning.
Springer London 2021-06-23 2022 /pmc/articles/PMC8827307/ /pubmed/35221541 http://dx.doi.org/10.1007/s00521-021-06117-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	S.I. : Adaptive and Learning Agents 2020 Barfuss, Wolfram Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics
title	Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics
topic	S.I. : Adaptive and Learning Agents 2020
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8827307/ https://www.ncbi.nlm.nih.gov/pubmed/35221541 http://dx.doi.org/10.1007/s00521-021-06117-0
