Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin

Direct reciprocity, or repeated interaction, is a main mechanism to sustain cooperation under social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social di...

Bibliographic Details
Main authors: Ezaki, Takahiro, Horita, Yutaka, Takezawa, Masanori, Masuda, Naoki
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4954710/
https://www.ncbi.nlm.nih.gov/pubmed/27438888
http://dx.doi.org/10.1371/journal.pcbi.1005034
_version_ 1782443817370124288
author Ezaki, Takahiro
Horita, Yutaka
Takezawa, Masanori
Masuda, Naoki
author_facet Ezaki, Takahiro
Horita, Yutaka
Takezawa, Masanori
Masuda, Naoki
author_sort Ezaki, Takahiro
collection PubMed
description Direct reciprocity, or repeated interaction, is a main mechanism for sustaining cooperation under social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation behavior and its moody variant. Mechanisms underlying these behaviors largely remain unclear. Here we provide a proximate account for this behavior by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff is larger than a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results obtained for both so-called moody and non-moody conditional cooperation, prisoner’s dilemma and public goods games, and well-mixed groups and networks. Unlike in previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated in every discrete time step, explains conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obeyed a noisy GRIM-like strategy. This differs from Pavlov, a reinforcement learning strategy promoting mutual cooperation in two-player situations.
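The reinforce / anti-reinforce rule described in the abstract can be illustrated with a minimal sketch. This record does not give the exact update rule, so the code below assumes a Bush-Mosteller-style update; the function names, the `tanh` stimulus, and the `sensitivity` parameter are illustrative assumptions, not taken from the record.

```python
import math


def stimulus(payoff, aspiration, sensitivity=1.0):
    """Return a value in (-1, 1); positive means the payoff exceeded the aspiration.

    The tanh form and the `sensitivity` parameter are assumptions for illustration.
    """
    return math.tanh(sensitivity * (payoff - aspiration))


def update_cooperation_prob(p, cooperated, payoff, aspiration, sensitivity=1.0):
    """Update the unconditional probability of cooperating after one round.

    p          -- current probability of cooperating, in [0, 1]
    cooperated -- True if the individual cooperated in this round
    """
    s = stimulus(payoff, aspiration, sensitivity)
    if cooperated:
        # Satisfied after cooperating: move p toward 1 (reinforce).
        # Unsatisfied after cooperating: move p toward 0 (anti-reinforce).
        return p + (1 - p) * s if s >= 0 else p + p * s
    # Satisfied after defecting: move p toward 0 (reinforce defection).
    # Unsatisfied after defecting: move p toward 1 (anti-reinforce defection).
    return p - p * s if s >= 0 else p - (1 - p) * s


if __name__ == "__main__":
    p = 0.5
    # Hypothetical payoffs, with a fixed aspiration level of 1.0.
    for cooperated, payoff in [(True, 3.0), (True, 0.0), (False, 3.0), (False, 0.0)]:
        new_p = update_cooperation_prob(p, cooperated, payoff, aspiration=1.0)
        print(cooperated, payoff, round(new_p, 3))
```

Note that the update uses only the individual's own action and payoff, which matches the abstract's assumption that individuals have no information about what others are doing.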
format Online
Article
Text
id pubmed-4954710
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-49547102016-08-08 Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin Ezaki, Takahiro Horita, Yutaka Takezawa, Masanori Masuda, Naoki PLoS Comput Biol Research Article Direct reciprocity, or repeated interaction, is a main mechanism for sustaining cooperation under social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation behavior and its moody variant. Mechanisms underlying these behaviors largely remain unclear. Here we provide a proximate account for this behavior by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff is larger than a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results obtained for both so-called moody and non-moody conditional cooperation, prisoner’s dilemma and public goods games, and well-mixed groups and networks. Unlike in previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated in every discrete time step, explains conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obeyed a noisy GRIM-like strategy. This differs from Pavlov, a reinforcement learning strategy promoting mutual cooperation in two-player situations.
Public Library of Science 2016-07-20 /pmc/articles/PMC4954710/ /pubmed/27438888 http://dx.doi.org/10.1371/journal.pcbi.1005034 Text en © 2016 Ezaki et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Ezaki, Takahiro
Horita, Yutaka
Takezawa, Masanori
Masuda, Naoki
Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin
title Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin
title_full Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin
title_fullStr Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin
title_full_unstemmed Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin
title_short Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin
title_sort reinforcement learning explains conditional cooperation and its moody cousin
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4954710/
https://www.ncbi.nlm.nih.gov/pubmed/27438888
http://dx.doi.org/10.1371/journal.pcbi.1005034
work_keys_str_mv AT ezakitakahiro reinforcementlearningexplainsconditionalcooperationanditsmoodycousin
AT horitayutaka reinforcementlearningexplainsconditionalcooperationanditsmoodycousin
AT takezawamasanori reinforcementlearningexplainsconditionalcooperationanditsmoodycousin
AT masudanaoki reinforcementlearningexplainsconditionalcooperationanditsmoodycousin