Mixture of personality improved spiking actor network for efficient multi-agent cooperation

Adaptive multi-agent cooperation with especially unseen partners is becoming more challenging in multi-agent reinforcement learning (MARL) research, whereby conventional deep-learning-based algorithms suffer from the poor new-player-generalization problem, possibly caused by not considering theory-o...

Bibliographic Details
Main authors: Li, Xiyun; Ni, Ziyi; Ruan, Jingqing; Meng, Linghui; Shi, Jing; Zhang, Tielin; Xu, Bo
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Neuroscience
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10361619/
https://www.ncbi.nlm.nih.gov/pubmed/37483340
http://dx.doi.org/10.3389/fnins.2023.1219405
author Li, Xiyun
Ni, Ziyi
Ruan, Jingqing
Meng, Linghui
Shi, Jing
Zhang, Tielin
Xu, Bo
author_sort Li, Xiyun
collection PubMed
description Adaptive multi-agent cooperation with especially unseen partners is becoming more challenging in multi-agent reinforcement learning (MARL) research, whereby conventional deep-learning-based algorithms suffer from the poor new-player-generalization problem, possibly caused by not considering theory-of-mind theory (ToM). Inspired by the ToM personality in cognitive psychology, where a human can easily resolve this problem by predicting others' intuitive personality first before complex actions, we propose a biologically-plausible algorithm named the mixture of personality (MoP) improved spiking actor network (SAN). The MoP module contains a determinantal point process to simulate the formation and integration of different personality types, and the SAN module contains spiking neurons for efficient reinforcement learning. The experimental results on the benchmark cooperative overcooked task showed that the proposed MoP-SAN algorithm could achieve higher performance for the paradigms with (learning) and without (generalization) unseen partners. Furthermore, ablation experiments highlighted the contribution of MoP in SAN learning, and some visualization analysis explained why the proposed algorithm is superior to some counterpart deep actor networks.
format Online
Article
Text
id pubmed-10361619
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10361619 2023-07-22 Front Neurosci Neuroscience Frontiers Media S.A. 2023-07-06 /pmc/articles/PMC10361619/ /pubmed/37483340 http://dx.doi.org/10.3389/fnins.2023.1219405 Text en Copyright © 2023 Li, Ni, Ruan, Meng, Shi, Zhang and Xu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Mixture of personality improved spiking actor network for efficient multi-agent cooperation
topic Neuroscience