Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models

Paper Title

Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models

Authors

Weiye Zhao, Tairan He, Changliu Liu

Abstract

Safety is one of the biggest concerns in applying reinforcement learning (RL) to the physical world. At its core, the challenge is to ensure that RL agents persistently satisfy a hard state constraint without white-box or black-box dynamics models. This paper presents an integrated model learning and safe control framework to safeguard any agent, where its dynamics are learned as Gaussian processes. The proposed theory provides (i) a novel method to construct an offline dataset for model learning that best achieves safety requirements; (ii) a parameterization rule for the safety index to ensure the existence of safe control; (iii) a safety guarantee in terms of probabilistic forward invariance when the model is learned using the aforementioned dataset. Simulation results show that our framework guarantees almost zero safety violations on various continuous control tasks.
