Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction
Reinforcement learning from demonstration (RLfD) is a promising approach to improving reinforcement learning (RL) by leveraging expert demonstrations as additional decision-making guidance. However, most existing RLfD methods treat demonstrations only as low-level knowledge instances tied to a specific task: demonstrations are typically used either to provide additional rewards or to pretrain the neural network-based RL policy in a supervised manner, usually resulting in poor generalization and weak robustness. Considering that human knowledge is both interpretable and well suited to generalization, we propose to exploit the potential of demonstrations by extracting knowledge from them via Bayesian networks, and we develop a novel RLfD method called Reinforcement Learning from demonstration via Bayesian Network-based Knowledge (RLBNK). RLBNK uses the node influence with Wasserstein distance metric (NIW) algorithm to obtain abstract concepts from demonstrations; a Bayesian network then performs knowledge learning and inference on the resulting abstract data set, yielding a coarse policy with a corresponding confidence. When the coarse policy's confidence is low, an RL-based refinement module further optimizes and fine-tunes the policy, forming a (near-)optimal hybrid policy. Experimental results show that RLBNK improves the learning efficiency of the corresponding baseline RL algorithms under both normal and sparse reward settings. Furthermore, we demonstrate that RLBNK delivers better generalization and robustness than the baseline methods.
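To make the pipeline in the abstract concrete, here is a minimal, hypothetical sketch of the two components it describes: ranking state features by a Wasserstein-based influence score (a crude stand-in for the paper's NIW algorithm, whose exact definition is not given here) and a confidence-gated hybrid policy that queries the Bayesian network first and falls back to the RL refinement policy when confidence is low. All names (`rank_features_by_wasserstein`, `HybridPolicy`, `confidence_threshold`) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the RLBNK pipeline described in the abstract;
# not the authors' code. scipy's 1-D Wasserstein distance is used as a
# stand-in for the paper's NIW node-influence score (an assumption).
import numpy as np
from scipy.stats import wasserstein_distance

def rank_features_by_wasserstein(demo_states: np.ndarray,
                                 random_states: np.ndarray) -> np.ndarray:
    """Rank state features (candidate Bayesian-network nodes) by how much
    their marginal distributions differ between expert demonstrations and
    randomly collected states; a larger distance suggests more influence."""
    scores = np.array([
        wasserstein_distance(demo_states[:, j], random_states[:, j])
        for j in range(demo_states.shape[1])
    ])
    return np.argsort(scores)[::-1]  # feature indices, most influential first

class HybridPolicy:
    """Confidence-gated hybrid policy: act on the Bayesian network's coarse
    policy when its posterior confidence is high, otherwise defer to the
    RL refinement policy."""

    def __init__(self, bn_posterior, rl_policy, confidence_threshold=0.8):
        self.bn_posterior = bn_posterior        # state -> P(action | state)
        self.rl_policy = rl_policy              # state -> action
        self.threshold = confidence_threshold   # assumed hyperparameter

    def act(self, state):
        probs = np.asarray(self.bn_posterior(state))
        if probs.max() >= self.threshold:
            return int(probs.argmax())          # trust the coarse BN policy
        return self.rl_policy(state)            # fall back to RL refinement
```

In this reading, the Bayesian network supplies interpretable guidance on states covered by the abstracted demonstrations, while the RL module handles the rest; the threshold trades coverage of the coarse policy against reliance on RL.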
Main Authors: | Zhang, Yichuan; Lan, Yixing; Fang, Qiang; Xu, Xin; Li, Junxiang; Zeng, Yujun |
Format: | Online Article Text |
Language: | English |
Published: | Hindawi, 2021 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486502/ https://www.ncbi.nlm.nih.gov/pubmed/34603434 http://dx.doi.org/10.1155/2021/7588221 |
_version_ | 1784577752274305024 |
author | Zhang, Yichuan; Lan, Yixing; Fang, Qiang; Xu, Xin; Li, Junxiang; Zeng, Yujun
author_facet | Zhang, Yichuan; Lan, Yixing; Fang, Qiang; Xu, Xin; Li, Junxiang; Zeng, Yujun
author_sort | Zhang, Yichuan |
collection | PubMed |
description | Reinforcement learning from demonstration (RLfD) is a promising approach to improving reinforcement learning (RL) by leveraging expert demonstrations as additional decision-making guidance. However, most existing RLfD methods treat demonstrations only as low-level knowledge instances tied to a specific task: demonstrations are typically used either to provide additional rewards or to pretrain the neural network-based RL policy in a supervised manner, usually resulting in poor generalization and weak robustness. Considering that human knowledge is both interpretable and well suited to generalization, we propose to exploit the potential of demonstrations by extracting knowledge from them via Bayesian networks, and we develop a novel RLfD method called Reinforcement Learning from demonstration via Bayesian Network-based Knowledge (RLBNK). RLBNK uses the node influence with Wasserstein distance metric (NIW) algorithm to obtain abstract concepts from demonstrations; a Bayesian network then performs knowledge learning and inference on the resulting abstract data set, yielding a coarse policy with a corresponding confidence. When the coarse policy's confidence is low, an RL-based refinement module further optimizes and fine-tunes the policy, forming a (near-)optimal hybrid policy. Experimental results show that RLBNK improves the learning efficiency of the corresponding baseline RL algorithms under both normal and sparse reward settings. Furthermore, we demonstrate that RLBNK delivers better generalization and robustness than the baseline methods. |
format | Online Article Text |
id | pubmed-8486502 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8486502 2021-10-02 Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction Zhang, Yichuan; Lan, Yixing; Fang, Qiang; Xu, Xin; Li, Junxiang; Zeng, Yujun Comput Intell Neurosci Research Article Reinforcement learning from demonstration (RLfD) is a promising approach to improving reinforcement learning (RL) by leveraging expert demonstrations as additional decision-making guidance. However, most existing RLfD methods treat demonstrations only as low-level knowledge instances tied to a specific task: demonstrations are typically used either to provide additional rewards or to pretrain the neural network-based RL policy in a supervised manner, usually resulting in poor generalization and weak robustness. Considering that human knowledge is both interpretable and well suited to generalization, we propose to exploit the potential of demonstrations by extracting knowledge from them via Bayesian networks, and we develop a novel RLfD method called Reinforcement Learning from demonstration via Bayesian Network-based Knowledge (RLBNK). RLBNK uses the node influence with Wasserstein distance metric (NIW) algorithm to obtain abstract concepts from demonstrations; a Bayesian network then performs knowledge learning and inference on the resulting abstract data set, yielding a coarse policy with a corresponding confidence. When the coarse policy's confidence is low, an RL-based refinement module further optimizes and fine-tunes the policy, forming a (near-)optimal hybrid policy. Experimental results show that RLBNK improves the learning efficiency of the corresponding baseline RL algorithms under both normal and sparse reward settings. Furthermore, we demonstrate that RLBNK delivers better generalization and robustness than the baseline methods. Hindawi 2021-09-24 /pmc/articles/PMC8486502/ /pubmed/34603434 http://dx.doi.org/10.1155/2021/7588221 Text en Copyright © 2021 Yichuan Zhang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zhang, Yichuan Lan, Yixing Fang, Qiang Xu, Xin Li, Junxiang Zeng, Yujun Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction |
title | Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction |
title_full | Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction |
title_fullStr | Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction |
title_full_unstemmed | Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction |
title_short | Efficient Reinforcement Learning from Demonstration via Bayesian Network-Based Knowledge Extraction |
title_sort | efficient reinforcement learning from demonstration via bayesian network-based knowledge extraction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486502/ https://www.ncbi.nlm.nih.gov/pubmed/34603434 http://dx.doi.org/10.1155/2021/7588221 |
work_keys_str_mv | AT zhangyichuan efficientreinforcementlearningfromdemonstrationviabayesiannetworkbasedknowledgeextraction AT lanyixing efficientreinforcementlearningfromdemonstrationviabayesiannetworkbasedknowledgeextraction AT fangqiang efficientreinforcementlearningfromdemonstrationviabayesiannetworkbasedknowledgeextraction AT xuxin efficientreinforcementlearningfromdemonstrationviabayesiannetworkbasedknowledgeextraction AT lijunxiang efficientreinforcementlearningfromdemonstrationviabayesiannetworkbasedknowledgeextraction AT zengyujun efficientreinforcementlearningfromdemonstrationviabayesiannetworkbasedknowledgeextraction |