Mixture of personality improved spiking actor network for efficient multi-agent cooperation
Adaptive multi-agent cooperation with especially unseen partners is becoming more challenging in multi-agent reinforcement learning (MARL) research, whereby conventional deep-learning-based algorithms suffer from the poor new-player-generalization problem, possibly caused by not considering theory-o...
Main Authors: | Li, Xiyun; Ni, Ziyi; Ruan, Jingqing; Meng, Linghui; Shi, Jing; Zhang, Tielin; Xu, Bo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10361619/ https://www.ncbi.nlm.nih.gov/pubmed/37483340 http://dx.doi.org/10.3389/fnins.2023.1219405 |
_version_ | 1785076256975355904 |
---|---|
author | Li, Xiyun Ni, Ziyi Ruan, Jingqing Meng, Linghui Shi, Jing Zhang, Tielin Xu, Bo |
author_facet | Li, Xiyun Ni, Ziyi Ruan, Jingqing Meng, Linghui Shi, Jing Zhang, Tielin Xu, Bo |
author_sort | Li, Xiyun |
collection | PubMed |
description | Adaptive multi-agent cooperation with especially unseen partners is becoming more challenging in multi-agent reinforcement learning (MARL) research, whereby conventional deep-learning-based algorithms suffer from the poor new-player-generalization problem, possibly caused by not considering theory-of-mind theory (ToM). Inspired by the ToM personality in cognitive psychology, where a human can easily resolve this problem by predicting others' intuitive personality first before complex actions, we propose a biologically-plausible algorithm named the mixture of personality (MoP) improved spiking actor network (SAN). The MoP module contains a determinantal point process to simulate the formation and integration of different personality types, and the SAN module contains spiking neurons for efficient reinforcement learning. The experimental results on the benchmark cooperative overcooked task showed that the proposed MoP-SAN algorithm could achieve higher performance for the paradigms with (learning) and without (generalization) unseen partners. Furthermore, ablation experiments highlighted the contribution of MoP in SAN learning, and some visualization analysis explained why the proposed algorithm is superior to some counterpart deep actor networks. |
format | Online Article Text |
id | pubmed-10361619 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10361619 2023-07-22 Mixture of personality improved spiking actor network for efficient multi-agent cooperation Li, Xiyun Ni, Ziyi Ruan, Jingqing Meng, Linghui Shi, Jing Zhang, Tielin Xu, Bo Front Neurosci Neuroscience Adaptive multi-agent cooperation with especially unseen partners is becoming more challenging in multi-agent reinforcement learning (MARL) research, whereby conventional deep-learning-based algorithms suffer from the poor new-player-generalization problem, possibly caused by not considering theory-of-mind theory (ToM). Inspired by the ToM personality in cognitive psychology, where a human can easily resolve this problem by predicting others' intuitive personality first before complex actions, we propose a biologically-plausible algorithm named the mixture of personality (MoP) improved spiking actor network (SAN). The MoP module contains a determinantal point process to simulate the formation and integration of different personality types, and the SAN module contains spiking neurons for efficient reinforcement learning. The experimental results on the benchmark cooperative overcooked task showed that the proposed MoP-SAN algorithm could achieve higher performance for the paradigms with (learning) and without (generalization) unseen partners. Furthermore, ablation experiments highlighted the contribution of MoP in SAN learning, and some visualization analysis explained why the proposed algorithm is superior to some counterpart deep actor networks. Frontiers Media S.A. 2023-07-06 /pmc/articles/PMC10361619/ /pubmed/37483340 http://dx.doi.org/10.3389/fnins.2023.1219405 Text en Copyright © 2023 Li, Ni, Ruan, Meng, Shi, Zhang and Xu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Li, Xiyun Ni, Ziyi Ruan, Jingqing Meng, Linghui Shi, Jing Zhang, Tielin Xu, Bo Mixture of personality improved spiking actor network for efficient multi-agent cooperation |
title | Mixture of personality improved spiking actor network for efficient multi-agent cooperation |
title_full | Mixture of personality improved spiking actor network for efficient multi-agent cooperation |
title_fullStr | Mixture of personality improved spiking actor network for efficient multi-agent cooperation |
title_full_unstemmed | Mixture of personality improved spiking actor network for efficient multi-agent cooperation |
title_short | Mixture of personality improved spiking actor network for efficient multi-agent cooperation |
title_sort | mixture of personality improved spiking actor network for efficient multi-agent cooperation |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10361619/ https://www.ncbi.nlm.nih.gov/pubmed/37483340 http://dx.doi.org/10.3389/fnins.2023.1219405 |
work_keys_str_mv | AT lixiyun mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation AT niziyi mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation AT ruanjingqing mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation AT menglinghui mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation AT shijing mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation AT zhangtielin mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation AT xubo mixtureofpersonalityimprovedspikingactornetworkforefficientmultiagentcooperation |