Learn Quasi-Stationary Distributions of Finite State Markov Chain
We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by candidate distribution and true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method.
Main Authors: | Cai, Zhiqiang; Lin, Ling; Zhou, Xiang |
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774945/ https://www.ncbi.nlm.nih.gov/pubmed/35052159 http://dx.doi.org/10.3390/e24010133 |
_version_ | 1784636465914839040 |
author | Cai, Zhiqiang; Lin, Ling; Zhou, Xiang |
author_facet | Cai, Zhiqiang; Lin, Ling; Zhou, Xiang |
author_sort | Cai, Zhiqiang |
collection | PubMed |
description | We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by candidate distribution and true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method. |
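The fixed-point formulation mentioned in the abstract can be illustrated on a small example: for a finite-state chain with an absorbing state, the quasi-stationary distribution (QSD) is the normalized left Perron eigenvector of the substochastic matrix Q restricted to the transient states, and the plain fixed-point iteration ν ← νQ / ‖νQ‖₁ converges to it. The sketch below shows this classical iteration only, not the paper's RL/actor-critic method, and the 3-state matrix Q is a hypothetical example:

```python
import numpy as np

# Hypothetical 3-state chain with an extra absorbing state.
# Q is the transition matrix restricted to the transient states
# {0, 1, 2}; each row sums to 0.9, the remaining 0.1 being the
# per-step absorption probability.
Q = np.array([
    [0.5, 0.3, 0.1],
    [0.2, 0.5, 0.2],
    [0.1, 0.3, 0.5],
])

def quasi_stationary_distribution(Q, tol=1e-12, max_iter=10_000):
    """Fixed-point (power) iteration nu <- nu Q / ||nu Q||_1.

    The limit is the normalized left Perron eigenvector of Q,
    i.e. the quasi-stationary distribution."""
    nu = np.full(Q.shape[0], 1.0 / Q.shape[0])  # uniform initial guess
    for _ in range(max_iter):
        nxt = nu @ Q
        nxt /= nxt.sum()              # renormalize to a probability vector
        if np.abs(nxt - nu).max() < tol:
            return nxt
        nu = nxt
    return nu

nu = quasi_stationary_distribution(Q)
# At the fixed point, nu Q = lambda * nu, where lambda is the
# one-step survival probability under nu.
lam = (nu @ Q).sum()
```

The paper's contribution is to reach the same fixed point by gradient descent on a KL-divergence between path measures, which scales to settings where the power iteration above is impractical.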
format | Online Article Text |
id | pubmed-8774945 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8774945 2022-01-21 Learn Quasi-Stationary Distributions of Finite State Markov Chain Cai, Zhiqiang Lin, Ling Zhou, Xiang Entropy (Basel) Article We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by candidate distribution and true target distribution. To solve this challenging minimization problem by gradient descent, we apply a reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method. MDPI 2022-01-17 /pmc/articles/PMC8774945/ /pubmed/35052159 http://dx.doi.org/10.3390/e24010133 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Cai, Zhiqiang Lin, Ling Zhou, Xiang Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_full | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_fullStr | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_full_unstemmed | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_short | Learn Quasi-Stationary Distributions of Finite State Markov Chain |
title_sort | learn quasi-stationary distributions of finite state markov chain |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774945/ https://www.ncbi.nlm.nih.gov/pubmed/35052159 http://dx.doi.org/10.3390/e24010133 |
work_keys_str_mv | AT caizhiqiang learnquasistationarydistributionsoffinitestatemarkovchain AT linling learnquasistationarydistributionsoffinitestatemarkovchain AT zhouxiang learnquasistationarydistributionsoffinitestatemarkovchain |