Relative Entropy of Correct Proximal Policy Optimization Algorithms with Modified Penalty Factor in Complex Environment

In the field of reinforcement learning, we propose a Correct Proximal Policy Optimization (CPPO) algorithm based on the modified penalty factor β and relative entropy in order to address the robustness and stationarity problems of traditional algorithms. Firstly, in the process of reinforcement learning, this...

Bibliographic Details
Main authors:	Chen, Weimin; Wong, Kelvin Kian Loong; Long, Sifan; Sun, Zhili
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9031020/ https://www.ncbi.nlm.nih.gov/pubmed/35455103 http://dx.doi.org/10.3390/e24040440

Similar Items

A penalty-based algorithm proposal for engineering optimization problems
by: Oztas, Gulin Zeynep, et al.
Published: (2022)

A Geometrical Perspective for the Bargaining Problem
by: Wong, Kelvin Kian Loong
Published: (2010)

An Inexact Penalty Decomposition Method for Sparse Optimization
by: Dong, Zhengshan, et al.
Published: (2021)

The Effects of Modifying the Distance of the Penalty Shot in Water Polo
by: Argudo, Francisco Manuel, et al.
Published: (2016)

Density-Based Penalty Parameter Optimization on C-SVM
by: Liu, Yun, et al.
Published: (2014)

The Death Penalty
Published: (1948)

The Penalties of Weakness
Published: (1888)

The Death Penalty
Published: (1890)

The Death Penalty
Published: (1889)

Penalty of Overwork
Published: (1861)

Automatic Target Recognition Based on Cross-Plot
by: Wong, Kelvin Kian Loong, et al.
Published: (2011)

Penalty Dynamic Programming Algorithm for Dim Targets Detection in Sensor Systems
by: Huang, Dayu, et al.
Published: (2012)

An interior-point $l_{1}$-penalty method for nonlinear optimization
by: Gould, N I M, et al.
Published: (2003)

An Economic Model of Optimal Penalty for Health Care Workplace Violence
by: Sun, Zesheng, et al.
Published: (2019)

Decision models of emission reduction considering CSR under reward-penalty policy
by: Wang, Yang, et al.
Published: (2023)

Pensions without Penalties
Published: (1892)

Rape and Its Penalty
Published: (1904)

Penalties of Public Singers
Published: (1874)

Computational hemodynamics: theory, modelling and applications
by: Tu, Jiyuan, et al.
Published: (2015)

Is There a Rural Penalty in Language Acquisition? Evidence From Germany's Refugee Allocation Policy
by: Khalil, Samir, et al.
Published: (2022)

The modified proximal point algorithm in Hadamard spaces
by: Chang, Shih-sen, et al.
Published: (2018)

Drought-Tolerant Corn Hybrids Yield More in Drought-Stressed Environments with No Penalty in Non-stressed Environments
by: Adee, Eric, et al.
Published: (2016)

Penalties for Non-Payment of Arrears
Published: (1914)

Penalties Attached to the Registration of Deaths
by: Pagan, J. M., et al.
Published: (1860)

The Penalty of Neglecting Veterinary Medicine
Published: (1902)

Glycemic penalty index for adequately assessing and comparing different blood glucose control algorithms
by: Van Herpe, Tom, et al.
Published: (2008)

Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance
by: Zhao, Weiwei, et al.
Published: (2020)

Smoothing approximation to the lower order exact penalty function for inequality constrained optimization
by: Lian, Shujun, et al.
Published: (2018)

Minimizing the Entropy Penalty for Ligand Binding: Lessons from the Molecular Recognition of the Histo Blood‐Group Antigens by Human Galectin‐3
by: Gimeno, Ana, et al.
Published: (2019)

EFFECT OF PENALTY MINUTE RULE CHANGE ON INJURIES AND GAME DISQUALIFICATION PENALTIES IN HIGH SCHOOL ICE HOCKEY
by: Kriz, Peter, et al.
Published: (2019)

Effect of Bayesian penalty likelihood algorithm on 18F-FDG PET/CT image of lymphoma
by: Wang, Yongtao, et al.
Published: (2021)

Live Multiattribute Data Mining and Penalty Decision-Making in Basketball Games Based on the Apriori Algorithm
by: Zeng, Jian, et al.
Published: (2022)

Optimizing Age Penalty in Time-Varying Networks with Markovian and Error-Prone Channel State
by: Chen, Yuchao, et al.
Published: (2021)

Protein sequence optimization with a pairwise decomposable penalty for buried unsatisfied hydrogen bonds
by: Coventry, Brian, et al.
Published: (2021)

Death Penalty and Psychiatric Evaluation in Japan
by: Kashiwagi, Hiroko, et al.
Published: (2018)

The albedo-climate penalty of hydropower reservoirs
by: Wohlfahrt, Georg, et al.
Published: (2021)

The social capital penalty paid by teetotallers
by: Walker, Benjamin, et al.
Published: (2023)

Automatic Management of Cloud Applications with Use of Proximal Policy Optimization
by: Funika, Włodzimierz, et al.
Published: (2020)

Quantum architecture search via truly proximal policy optimization
by: Zhu, Xianchao, et al.
Published: (2023)

An Efficient, Parallelized Algorithm for Optimal Conditional Entropy-Based Feature Selection
by: Estrela, Gustavo, et al.
Published: (2020)
