Cargando…

Intelligent L2-L∞ Consensus of Multiagent Systems under Switching Topologies via Fuzzy Deep Q Learning

The problem of intelligent L(2)-L(∞) consensus design for leader-followers multiagent systems (MASs) under switching topologies is investigated based on switched control theory and fuzzy deep Q learning. It is supposed that the communication topologies are time-varying, and the model of MASs under s...

Descripción completa

Detalles Bibliográficos
Autores principales: Cheng, Haoyu, Xu, Linpeng, Song, Ruijia, Zhu, Yue, Fang, Yangwang
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Hindawi 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8865973/
https://www.ncbi.nlm.nih.gov/pubmed/35222626
http://dx.doi.org/10.1155/2022/4105546
Descripción
Sumario:The problem of intelligent L(2)-L(∞) consensus design for leader-followers multiagent systems (MASs) under switching topologies is investigated based on switched control theory and fuzzy deep Q learning. It is supposed that the communication topologies are time-varying, and the model of MASs under switching topologies is constructed based on switched systems. By employing linear transformation, the problem of consensus of MASs is converted into the issue of L(2)-L(∞) control. The consensus protocol is composed of the dynamics-based protocol and learning-based protocol, where the robust control theory and deep Q learning are applied for the two parts to guarantee the prescribed performance and improve the transient performance. The multiple Lyapunov function (MLF) method and mode-dependent average dwell time (MDADT) method are combined to give the scheduling interval, which ensures stability and prescribed attenuation performance. The sufficient existing conditions of consensus protocol are given, and the solutions of the dynamics-based protocol are derived based on linear matrix inequalities (LMIs). Then, the online design of the learning-based protocol is formulated as a Markov decision process, where the fuzzy deep Q learning is utilized to compensate for the uncertainties and achieve optimal performance. The variation of the learning-based protocol is modeled as the external compensation on the dynamics-based protocol. Therefore, the convergence of the proposed protocol can be guaranteed by employing the nonfragile control theory. In the end, a numerical example is given to validate the effectiveness and superiority of the proposed method.