$\text{H}_{\infty}$ Tracking Control via Variable Gain Gradient Descent-Based Integral Reinforcement Learning for Unknown Continuous Time Nonlinear System

Paper Title

$\text{H}_{\infty}$ Tracking Control via Variable Gain Gradient Descent-Based Integral Reinforcement Learning for Unknown Continuous Time Nonlinear System

Authors

Mishra, Amardeep; Ghosh, Satadal

Abstract

Optimal tracking of continuous time nonlinear systems has been extensively studied in the literature. However, in several applications, the absence of knowledge about system dynamics poses a severe challenge to solving the optimal tracking problem. This has attracted growing attention among researchers recently, and integral reinforcement learning (IRL)-based methods augmented with an actor neural network (NN) have been deployed to this end. However, very few studies have been directed to model-free $H_{\infty}$ optimal tracking control, which helps attenuate the effect of disturbances on system performance without any prior knowledge of the system dynamics. To this end, a recursive least squares-based parameter update was recently proposed. However, a gradient descent-based parameter update scheme is more sensitive to real-time variation in plant dynamics. Moreover, the experience replay (ER) technique has been shown to improve the convergence of NN weights by utilizing past observations iteratively. Motivated by these observations, this paper presents a novel parameter update law based on variable gain gradient descent and the experience replay technique for tuning the weights of the critic, actor and disturbance NNs.
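The abstract does not state the update law itself, so the following is only a generic sketch of the two ingredients it names: a gradient descent step whose gain varies with the residual, and experience replay over stored samples. It assumes a linear-in-parameters approximator $w^\top \phi$ fitted to scalar targets. The names `replay_gradient_step`, `run_demo`, `base_gain`, and the particular gain formula are hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import deque
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def replay_gradient_step(w, phi_now, y_now, buffer, base_gain=0.5):
    """One normalized gradient step on the current sample plus replayed ones.

    Illustrative only: the gain rule below is an assumption, not the
    paper's variable gain law.
    """
    samples = [(phi_now, y_now)] + list(buffer)
    grad = [0.0] * len(w)
    total_sq_err = 0.0
    for phi, y in samples:
        err = dot(w, phi) - y                 # residual on this sample
        norm = (1.0 + dot(phi, phi)) ** 2     # normalization keeps steps bounded
        for i, p in enumerate(phi):
            grad[i] += err * p / norm
        total_sq_err += err * err
    # Hypothetical variable gain: between base_gain and 2 * base_gain,
    # larger when the accumulated residual is large.
    gain = base_gain * (1.0 + total_sq_err / (1.0 + total_sq_err))
    step = gain / len(samples)                # average over current + replayed
    return [wi - step * gi for wi, gi in zip(w, grad)]


def run_demo(steps=3000, seed=0):
    """Fit weights to noiseless targets generated by fixed true weights."""
    rng = random.Random(seed)
    w_true = [1.0, -2.0, 0.5]
    w = [0.0, 0.0, 0.0]
    buffer = deque(maxlen=10)                 # sliding window of past samples
    for _ in range(steps):
        phi = [rng.gauss(0.0, 1.0) for _ in range(3)]
        y = dot(w_true, phi)
        w = replay_gradient_step(w, phi, y, buffer)
        buffer.append((phi, y))
    return w, w_true
```

Calling `run_demo()` returns the learned and true weight lists for comparison. In this toy setting, the replayed samples keep several regressor directions in every update, which is the intuition behind using past observations to aid weight convergence.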
