Learn Quasi-Stationary Distributions of Finite State Markov Chain
We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by candidate distribution and true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method.
Main Authors: | Cai, Zhiqiang; Lin, Ling; Zhou, Xiang |
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774945/ https://www.ncbi.nlm.nih.gov/pubmed/35052159 http://dx.doi.org/10.3390/e24010133 |
_version_ | 1784636465914839040 |
author | Cai, Zhiqiang; Lin, Ling; Zhou, Xiang |
author_facet | Cai, Zhiqiang; Lin, Ling; Zhou, Xiang |
author_sort | Cai, Zhiqiang |
collection | PubMed |
description | We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by candidate distribution and true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method. |
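The fixed-point formulation mentioned in the abstract can be illustrated on a small example: for a finite-state chain with an absorbing state, the quasi-stationary distribution (QSD) is the normalized left Perron eigenvector of the substochastic matrix Q restricted to the transient states, and the plain fixed-point iteration ν ← νQ / ‖νQ‖₁ converges to it. The sketch below shows this classical iteration only, not the paper's RL/actor-critic method, and the 3-state matrix Q is a hypothetical example:

```python
import numpy as np

# Hypothetical 3-state chain with an extra absorbing state.
# Q is the transition matrix restricted to the transient states
# {0, 1, 2}; each row sums to 0.9, the remaining 0.1 being the
# per-step absorption probability.
Q = np.array([
    [0.5, 0.3, 0.1],
    [0.2, 0.5, 0.2],
    [0.1, 0.3, 0.5],
])

def quasi_stationary_distribution(Q, tol=1e-12, max_iter=10_000):
    """Fixed-point (power) iteration nu <- nu Q / ||nu Q||_1.

    The limit is the normalized left Perron eigenvector of Q,
    i.e. the quasi-stationary distribution."""
    nu = np.full(Q.shape[0], 1.0 / Q.shape[0])  # uniform initial guess
    for _ in range(max_iter):
        nxt = nu @ Q
        nxt /= nxt.sum()              # renormalize to a probability vector
        if np.abs(nxt - nu).max() < tol:
            return nxt
        nu = nxt
    return nu

nu = quasi_stationary_distribution(Q)
# At the fixed point, nu Q = lambda * nu, where lambda is the
# one-step survival probability under nu.
lam = (nu @ Q).sum()
```

The paper's contribution is to reach the same fixed point by gradient descent on a KL-divergence between path measures, which scales to settings where the power iteration above is impractical.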
format | Online Article Text |
id | pubmed-8774945 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8774945 2022-01-21 Learn Quasi-Stationary Distributions of Finite State Markov Chain Cai, Zhiqiang Lin, Ling Zhou, Xiang Entropy (Basel) Article We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by candidate distribution and true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method. MDPI 2022-01-17 /pmc/articles/PMC8774945/ /pubmed/35052159 http://dx.doi.org/10.3390/e24010133 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Cai, Zhiqiang Lin, Ling Zhou, Xiang Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_full | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_fullStr | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_full_unstemmed | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_short | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_sort | learn quasi-stationary distributions of finite state markov chain |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774945/ https://www.ncbi.nlm.nih.gov/pubmed/35052159 http://dx.doi.org/10.3390/e24010133 |
work_keys_str_mv | AT caizhiqiang learnquasistationarydistributionsoffinitestatemarkovchain AT linling learnquasistationarydistributionsoffinitestatemarkovchain AT zhouxiang learnquasistationarydistributionsoffinitestatemarkovchain |