
Learn Quasi-Stationary Distributions of Finite State Markov Chain

We propose a reinforcement learning (RL) approach to compute the expression of the quasi-stationary distribution. Based on the fixed-point formulation of the quasi-stationary distribution, we minimize the KL-divergence between the two Markovian path distributions induced by the candidate distribution and the true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique, introducing reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. Numerical examples on finite-state Markov chains are presented to demonstrate the new method.
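For context, the "fixed-point formulation" the abstract refers to is presumably the standard characterization of a quasi-stationary distribution (QSD); in the notation below (ours, not taken from the paper), Q denotes the transition matrix restricted to the non-absorbed states:

% Standard fixed-point characterization of a QSD (notation assumed, not from the paper).
% Q is the substochastic transition matrix over the transient states.
\nu(y) = \frac{\sum_{x} \nu(x)\,Q(x,y)}{\sum_{x,z} \nu(x)\,Q(x,z)}
\quad \text{for all transient } y,
\qquad \text{equivalently} \qquad
\nu Q = \lambda\,\nu, \quad \lambda = \sum_{x,y} \nu(x)\,Q(x,y).

That is, the QSD is a left principal eigenvector of Q renormalized to a probability distribution, and the eigenvalue λ is the one-step survival probability under ν.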


Bibliographic Details
Main Authors: Cai, Zhiqiang, Lin, Ling, Zhou, Xiang
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774945/
https://www.ncbi.nlm.nih.gov/pubmed/35052159
http://dx.doi.org/10.3390/e24010133
_version_ 1784636465914839040
author Cai, Zhiqiang
Lin, Ling
Zhou, Xiang
author_facet Cai, Zhiqiang
Lin, Ling
Zhou, Xiang
author_sort Cai, Zhiqiang
collection PubMed
description We propose a reinforcement learning (RL) approach to compute the expression of the quasi-stationary distribution. Based on the fixed-point formulation of the quasi-stationary distribution, we minimize the KL-divergence between the two Markovian path distributions induced by the candidate distribution and the true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique, introducing reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. Numerical examples on finite-state Markov chains are presented to demonstrate the new method.
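As a point of reference (a minimal sketch, not the authors' actor-critic method): for a small finite chain, the QSD fixed point above can be computed directly by normalized power iteration on the substochastic matrix Q. The matrix Q and the helper name qsd_power_iteration below are illustrative assumptions, not taken from the paper.

    # Minimal sketch (assumed setup, NOT the paper's RL algorithm):
    # compute the QSD of a finite absorbing Markov chain as the left
    # principal eigenvector of the substochastic matrix Q over the
    # transient states, found by normalized power iteration.
    import numpy as np

    def qsd_power_iteration(Q, tol=1e-12, max_iter=100_000):
        """Left principal eigenvector of Q, renormalized to a distribution."""
        n = Q.shape[0]
        nu = np.full(n, 1.0 / n)          # uniform initial guess
        for _ in range(max_iter):
            nu_next = nu @ Q              # one step of the killed chain
            nu_next /= nu_next.sum()      # renormalize: condition on survival
            if np.abs(nu_next - nu).max() < tol:
                break
            nu = nu_next
        return nu

    # Toy example: 3 transient states; each row sums to < 1, the
    # deficit being the one-step absorption probability.
    Q = np.array([[0.50, 0.30, 0.10],
                  [0.20, 0.60, 0.10],
                  [0.10, 0.20, 0.60]])
    nu = qsd_power_iteration(Q)
    print("QSD:", nu, " survival rate:", (nu @ Q).sum())

The iteration converges because the renormalized map has the QSD as its attracting fixed point for an irreducible transient class; the paper's contribution, per the abstract, is to reach the same fixed point by minimizing a path-space KL-divergence with policy-gradient / actor-critic machinery instead of explicit matrix iteration.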
format Online
Article
Text
id pubmed-8774945
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8774945 2022-01-21 Learn Quasi-Stationary Distributions of Finite State Markov Chain Cai, Zhiqiang Lin, Ling Zhou, Xiang Entropy (Basel) Article We propose a reinforcement learning (RL) approach to compute the expression of the quasi-stationary distribution. Based on the fixed-point formulation of the quasi-stationary distribution, we minimize the KL-divergence between the two Markovian path distributions induced by the candidate distribution and the true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique, introducing reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. Numerical examples on finite-state Markov chains are presented to demonstrate the new method. MDPI 2022-01-17 /pmc/articles/PMC8774945/ /pubmed/35052159 http://dx.doi.org/10.3390/e24010133 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Cai, Zhiqiang
Lin, Ling
Zhou, Xiang
Learn Quasi-Stationary Distributions of Finite State Markov Chain
title Learn Quasi-Stationary Distributions of Finite State Markov Chain
title_full Learn Quasi-Stationary Distributions of Finite State Markov Chain
title_fullStr Learn Quasi-Stationary Distributions of Finite State Markov Chain
title_full_unstemmed Learn Quasi-Stationary Distributions of Finite State Markov Chain
title_short Learn Quasi-Stationary Distributions of Finite State Markov Chain
title_sort learn quasi-stationary distributions of finite state markov chain
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774945/
https://www.ncbi.nlm.nih.gov/pubmed/35052159
http://dx.doi.org/10.3390/e24010133
work_keys_str_mv AT caizhiqiang learnquasistationarydistributionsoffinitestatemarkovchain
AT linling learnquasistationarydistributionsoffinitestatemarkovchain
AT zhouxiang learnquasistationarydistributionsoffinitestatemarkovchain