
The Convergence of a Cooperation Markov Decision Process System

In a general Markov decision process system, only one agent’s learning evolution is considered. However, considering a single agent’s learning evolution has limitations in many problems, and more and more applications involve multiple agents. There are two types of cooperation, game environment a...


Bibliographic Details
Main authors:	Mo, Xiaoling; Xu, Daoyun; Fu, Zufeng
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597243/ https://www.ncbi.nlm.nih.gov/pubmed/33286724 http://dx.doi.org/10.3390/e22090955
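
The abstract for this record states that the value function of the two-agent cooperative system converges to a limit that does not depend on the initial value function. The sketch below illustrates that property with standard value iteration over joint actions. It is not the paper's algorithm: the states, actions, transition probabilities, rewards, and discount factor are all hypothetical toy values chosen for illustration.

```python
import itertools

# Hypothetical toy model: two states, two actions per agent, one shared reward.
STATES = [0, 1]
ACTIONS_A = [0, 1]  # actions of the first agent (assumed)
ACTIONS_B = [0, 1]  # actions of the second agent (assumed)
GAMMA = 0.9         # discount factor (assumed)


def transition(s, a, b):
    """Return {next_state: probability} for joint action (a, b) in state s."""
    # Assumed dynamics: coordinated actions reach state 1 more reliably.
    p = 0.9 if a == b else 0.2
    return {1: p, 0: 1.0 - p}


def reward(s, a, b):
    """Shared reward: paid only when the agents coordinate in state 1."""
    return 1.0 if (s == 1 and a == b) else 0.0


def backup(v, s, a, b):
    """One-step lookahead value of joint action (a, b) in state s."""
    expected = sum(p * v[t] for t, p in transition(s, a, b).items())
    return reward(s, a, b) + GAMMA * expected


def value_iteration(v0, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman optimality backup over joint actions until stable."""
    v = list(v0)
    for _ in range(max_iter):
        new_v = [
            max(backup(v, s, a, b)
                for a, b in itertools.product(ACTIONS_A, ACTIONS_B))
            for s in STATES
        ]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


def greedy_pair(v):
    """Pick, for each state, a joint action that maximizes the backup."""
    joint = list(itertools.product(ACTIONS_A, ACTIONS_B))
    return {s: max(joint, key=lambda ab: backup(v, s, ab[0], ab[1]))
            for s in STATES}


# Two very different starting points for the value function.
v_from_zero = value_iteration([0.0, 0.0])
v_from_large = value_iteration([100.0, -50.0])
pair = greedy_pair(v_from_zero)
```

Because the backup is a contraction for a discount factor below 1, both runs end at the same values up to the tolerance, and the greedy joint action in each state has the two agents choosing the same action.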

_version_	1783602300147728384
author	Mo, Xiaoling Xu, Daoyun Fu, Zufeng
collection	PubMed
description	In a general Markov decision process system, only one agent’s learning evolution is considered. However, considering a single agent’s learning evolution has limitations in many problems, and more and more applications involve multiple agents. Multi-agent environments are of two types: cooperative and game. This paper therefore introduces a Cooperation Markov Decision Process [Formula: see text] system with two agents, which is suited to the learning evolution of cooperative decisions between two agents. It is further found that the value function in the [Formula: see text] system also converges, and that the limit is independent of the choice of initial value function. The paper presents an algorithm for finding the optimal strategy pair [Formula: see text] in the [Formula: see text] system, whose fundamental task is to find an optimal strategy pair and form an evolutionary system [Formula: see text]. Finally, an example is given to support the theoretical results.
format	Online Article Text
id	pubmed-7597243
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7597243 2020-11-09 The Convergence of a Cooperation Markov Decision Process System Mo, Xiaoling Xu, Daoyun Fu, Zufeng Entropy (Basel) Article MDPI 2020-08-30 /pmc/articles/PMC7597243/ /pubmed/33286724 http://dx.doi.org/10.3390/e22090955 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	The Convergence of a Cooperation Markov Decision Process System
topic	Article
