
Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning

To improve the convergence rate and the sample efficiency, two efficient learning methods AC-HMLP and RAC-HMLP (AC-HMLP with ℓ₂-regularization) are proposed by combining actor-critic algorithm with hierarchical model learning and planning. The hierarchical models consisting of the local and the g...


Bibliographic Details
Main Authors:	Zhong, Shan; Liu, Quan; Fu, QiMing
Format:	Online Article Text
Language:	English
Published:	Hindawi Publishing Corporation 2016
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5066029/ https://www.ncbi.nlm.nih.gov/pubmed/27795704 http://dx.doi.org/10.1155/2016/4824072

_version_	1782460412272312320
author	Zhong, Shan Liu, Quan Fu, QiMing
author_facet	Zhong, Shan Liu, Quan Fu, QiMing
author_sort	Zhong, Shan
collection	PubMed
description	To improve the convergence rate and the sample efficiency, two efficient learning methods AC-HMLP and RAC-HMLP (AC-HMLP with ℓ₂-regularization) are proposed by combining actor-critic algorithm with hierarchical model learning and planning. The hierarchical models consisting of the local and the global models, which are learned at the same time during learning of the value function and the policy, are approximated by local linear regression (LLR) and linear function approximation (LFA), respectively. Both the local model and the global model are applied to generate samples for planning; the former is used only if the state-prediction error does not surpass the threshold at each time step, while the latter is utilized at the end of each episode. The purpose of taking both models is to improve the sample efficiency and accelerate the convergence rate of the whole algorithm through fully utilizing the local and global information. Experimentally, AC-HMLP and RAC-HMLP are compared with three representative algorithms on two Reinforcement Learning (RL) benchmark problems. The results demonstrate that they perform best in terms of convergence rate and sample efficiency.
format	Online Article Text
id	pubmed-5066029
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Hindawi Publishing Corporation
record_format	MEDLINE/PubMed
spelling	pubmed-5066029 2016-10-30 Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning Zhong, Shan Liu, Quan Fu, QiMing Comput Intell Neurosci Research Article To improve the convergence rate and the sample efficiency, two efficient learning methods AC-HMLP and RAC-HMLP (AC-HMLP with ℓ₂-regularization) are proposed by combining actor-critic algorithm with hierarchical model learning and planning. The hierarchical models consisting of the local and the global models, which are learned at the same time during learning of the value function and the policy, are approximated by local linear regression (LLR) and linear function approximation (LFA), respectively. Both the local model and the global model are applied to generate samples for planning; the former is used only if the state-prediction error does not surpass the threshold at each time step, while the latter is utilized at the end of each episode. The purpose of taking both models is to improve the sample efficiency and accelerate the convergence rate of the whole algorithm through fully utilizing the local and global information. Experimentally, AC-HMLP and RAC-HMLP are compared with three representative algorithms on two Reinforcement Learning (RL) benchmark problems. The results demonstrate that they perform best in terms of convergence rate and sample efficiency. Hindawi Publishing Corporation 2016 2016-10-03 /pmc/articles/PMC5066029/ /pubmed/27795704 http://dx.doi.org/10.1155/2016/4824072 Text en Copyright © 2016 Shan Zhong et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Zhong, Shan Liu, Quan Fu, QiMing Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
title	Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
title_full	Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
title_fullStr	Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
title_full_unstemmed	Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
title_short	Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning
title_sort	efficient actor-critic algorithm with hierarchical model learning and planning
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5066029/ https://www.ncbi.nlm.nih.gov/pubmed/27795704 http://dx.doi.org/10.1155/2016/4824072
work_keys_str_mv	AT zhongshan efficientactorcriticalgorithmwithhierarchicalmodellearningandplanning AT liuquan efficientactorcriticalgorithmwithhierarchicalmodellearningandplanning AT fuqiming efficientactorcriticalgorithmwithhierarchicalmodellearningandplanning
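
The description field states when each model is used for planning: the local model at a time step only if its state-prediction error does not surpass a threshold, and the global model once at the end of each episode. The following is a minimal Python sketch of that schedule only, not the authors' implementation. The function name `planning_schedule`, the list of per-step errors, and the threshold value are hypothetical and chosen for illustration; the LLR and LFA models and the actor-critic updates themselves are omitted.

```python
def planning_schedule(pred_errors, threshold):
    """Return (step, model) pairs saying which model plans when.

    pred_errors: hypothetical per-step state-prediction errors of the
                 local model over one episode.
    threshold:   hypothetical error threshold gating local planning.
    """
    events = []
    for step, error in enumerate(pred_errors):
        # Local planning runs only when the prediction error
        # does not surpass the threshold at this time step.
        if error <= threshold:
            events.append((step, "local"))
    # Global planning runs once, after the episode has ended.
    events.append((len(pred_errors), "global"))
    return events


if __name__ == "__main__":
    # Step 1 is skipped because its error exceeds the threshold.
    print(planning_schedule([0.02, 0.30, 0.05], threshold=0.1))
    # [(0, 'local'), (2, 'local'), (3, 'global')]
```

In this sketch an error exactly equal to the threshold still permits local planning, which is one reading of "does not surpass"; the record does not specify how the threshold is chosen.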


Similar Items