The Convergence of a Cooperation Markov Decision Process System
Main authors: | , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597243/ https://www.ncbi.nlm.nih.gov/pubmed/33286724 http://dx.doi.org/10.3390/e22090955 |
Summary: | In a general Markov decision process system, only one agent's learning evolution is considered. However, restricting attention to a single agent has limitations in many problems, and a growing number of applications involve multiple agents. Multi-agent environments are of two types: cooperative and game. Therefore, this paper introduces a Cooperation Markov Decision Process [Formula: see text] system with two agents, which is suitable for the learning evolution of cooperative decision-making between two agents. It is further shown that the value function in the [Formula: see text] system also converges, and that the value it converges to is independent of the choice of initial value function. This paper presents an algorithm for finding the optimal strategy pair [Formula: see text] in the [Formula: see text] system, whose fundamental task is to find an optimal strategy pair and form an evolutionary system [Formula: see text]. Finally, an example is given to support the theoretical results. |
---|
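The convergence claim in the summary can be illustrated with a minimal sketch. This is not the paper's algorithm, whose definitions are not reproduced in this record. It is ordinary value iteration applied to a joint action space, under the assumption that two cooperating agents share a single reward and jointly choose an action pair. The function name `value_iteration`, the toy `transition` and `reward` functions, and the discount factor 0.9 are all hypothetical choices made for illustration.

```python
import itertools


def value_iteration(states, actions_a, actions_b, transition, reward,
                    gamma=0.9, v0=None, tol=1e-10, max_iter=10_000):
    """Value iteration over joint actions (a, b) of two cooperating agents.

    Assumption (not from the source): both agents share one reward and the
    pair is treated as a single decision maker over the joint action space.
    """
    values = dict(v0) if v0 is not None else {s: 0.0 for s in states}
    joint_actions = list(itertools.product(actions_a, actions_b))

    def q(s, joint, vals):
        # One-step lookahead: immediate reward plus discounted expected value.
        expected = sum(p * vals[s2] for s2, p in transition(s, joint).items())
        return reward(s, joint) + gamma * expected

    for _ in range(max_iter):
        new_values = {s: max(q(s, j, values) for j in joint_actions)
                      for s in states}
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break

    # Greedy joint policy with respect to the converged values.
    policy = {s: max(joint_actions, key=lambda j: q(s, j, values))
              for s in states}
    return values, policy


# Hypothetical toy problem: two states, two actions per agent.
STATES = [0, 1]
ACTIONS = [0, 1]


def toy_reward(s, joint):
    a, b = joint
    # The agents are rewarded only when both pick the action matching the state.
    return 1.0 if a == b == s else 0.0


def toy_transition(s, joint):
    a, b = joint
    if a == b:
        # Coordinated actions usually move the system to the other state.
        return {1 - s: 0.8, s: 0.2}
    # Uncoordinated actions leave the state unchanged.
    return {s: 1.0}


# Run from two very different initial value functions.
v_zero, policy_zero = value_iteration(STATES, ACTIONS, ACTIONS,
                                      toy_transition, toy_reward)
v_far, policy_far = value_iteration(STATES, ACTIONS, ACTIONS,
                                    toy_transition, toy_reward,
                                    v0={0: 100.0, 1: -50.0})
```

In this toy problem both runs reach the same values and the same joint policy, which matches the summary's statement that the limit does not depend on the initial value function. The reason in this sketch is the standard one: with a discount factor below 1, the Bellman update is a contraction.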