
Curiosity-Driven Variational Autoencoder for Deep Q Network


Bibliographic Details
Main Authors: Han, Gao-Jie, Zhang, Xiao-Fang, Wang, Hao, Mao, Chen-Guang
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206149/
http://dx.doi.org/10.1007/978-3-030-47426-3_59
collection PubMed
description In recent years, deep reinforcement learning (DRL) has achieved tremendous success in high-dimensional, large-scale control and sequential decision-making tasks. However, current model-free DRL methods suffer from low sample efficiency, a bottleneck that limits their performance. To alleviate this problem, some researchers have used generative models to model the environment, but a generative model may become inaccurate or even collapse if the state space has not been sufficiently explored. In this paper, we introduce a model called the Curiosity-driven Variational Autoencoder (CVAE), which combines a variational autoencoder with curiosity-driven exploration. During training, the CVAE model improves sample efficiency, while curiosity-driven exploration ensures sufficient exploration of a complex environment. We then propose a CVAE-based algorithm, DQN-CVAE, that scales the CVAE to higher-dimensional environments. Finally, we evaluate the algorithm on several Atari 2600 games; the experimental results show that DQN-CVAE achieves better performance in terms of average reward per episode on these games.
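The abstract combines two ideas: a generative model (a VAE) of the environment to improve sample efficiency, and curiosity-driven exploration that rewards the agent for visiting states the model handles poorly. As a rough illustration of the curiosity side only (not the paper's exact formulation — the linear forward model, the `beta` bonus scale, and the additive reward shaping below are assumptions for illustration), the intrinsic reward can be taken as the prediction error of a learned dynamics model and added to the extrinsic environment reward:

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM = 4, 2
# A toy linear forward model predicting next_state from (state, action).
# In practice this would be a learned neural network.
W = rng.normal(scale=0.1, size=(STATE_DIM, STATE_DIM + ACTION_DIM))

def curiosity_bonus(state, action_onehot, next_state):
    """Intrinsic reward: squared prediction error of the forward model.
    Large error = poorly modeled (novel) state, so the bonus is large."""
    pred = W @ np.concatenate([state, action_onehot])
    return float(np.sum((pred - next_state) ** 2))

def shaped_reward(extrinsic, state, action_onehot, next_state, beta=0.1):
    """Extrinsic reward augmented with a scaled curiosity bonus."""
    return extrinsic + beta * curiosity_bonus(state, action_onehot, next_state)

# One hypothetical transition:
s = rng.normal(size=STATE_DIM)
a = np.array([1.0, 0.0])          # one-hot action
s_next = rng.normal(size=STATE_DIM)
r = shaped_reward(1.0, s, a, s_next)
```

In a DQN setting such as the paper's, this shaped reward would be stored in the replay buffer in place of the raw environment reward; transitions with large model error earn a larger bonus, steering exploration toward exactly the regions where the generative model is still inaccurate.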
id pubmed-7206149
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7206149 2020-05-08. Curiosity-Driven Variational Autoencoder for Deep Q Network. In: Advances in Knowledge Discovery and Data Mining (Article). 2020-04-17. /pmc/articles/PMC7206149/ http://dx.doi.org/10.1007/978-3-030-47426-3_59 Text en © Springer Nature Switzerland AG 2020. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
topic Article