DeepQ Stepper: A framework for reactive dynamic walking on uneven terrain

Paper title

DeepQ Stepper: A framework for reactive dynamic walking on uneven terrain

Authors

Avadesh Meduri, Majid Khadiv, Ludovic Righetti

Abstract

Reactive stepping and push recovery for biped robots are often restricted to flat terrain because capture regions are difficult to compute for nonlinear dynamic models. In this paper, we address this limitation by using reinforcement learning to approximately learn the 3D capture region for such systems. We propose a novel 3D reactive stepper, the DeepQ stepper, that computes optimal step locations for walking at different velocities using the 3D capture regions approximated by the action-value function. We demonstrate the ability of the approach to learn stepping with both a simplified 3D pendulum model and full robot dynamics. Further, the stepper achieves higher performance when it learns approximate capture regions while taking into account the entire dynamics of the robot, which are often ignored in existing reactive steppers based on simplified models. The DeepQ stepper can handle non-convex terrain with obstacles, walk on restricted surfaces like stepping stones, and recover from external disturbances at a constant computational cost.
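
The abstract's core idea, choosing a step location from an action-value function over candidate footholds, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the discretized candidate set, the feasibility test, and the convention that a lower value means a better step are all assumptions made for illustration, and the toy value function stands in for a learned network.

```python
from typing import Callable, Sequence, Tuple

Step = Tuple[float, float, float]  # hypothetical (x, y, z) step offset in metres


def select_step(
    state: Sequence[float],
    candidates: Sequence[Step],
    q_function: Callable[[Sequence[float], Step], float],
    is_feasible: Callable[[Step], bool],
) -> Step:
    """Return the feasible candidate step with the lowest estimated value.

    Assumption: q_function(state, step) is a learned cost estimate, so
    lower is better. Infeasible footholds (e.g. gaps or obstacles) are
    filtered out before the minimization.
    """
    feasible = [step for step in candidates if is_feasible(step)]
    if not feasible:
        raise ValueError("no feasible step location among the candidates")
    return min(feasible, key=lambda step: q_function(state, step))


# Toy stand-in for a learned action-value function: the cost grows with the
# distance between the step's x offset and a desired offset held in the state.
def toy_q(state: Sequence[float], step: Step) -> float:
    return abs(step[0] - state[0])


candidates = [(0.0, 0.1, 0.0), (0.1, 0.1, 0.0), (0.2, 0.1, 0.05)]
blocked = {(0.1, 0.1, 0.0)}  # hypothetical obstacle on the middle foothold

best = select_step((0.12,), candidates, toy_q, lambda s: s not in blocked)
print(best)
```

Because the selection is a single pass over a fixed candidate set, its cost does not depend on the terrain shape, which is one plausible reading of the constant computational cost mentioned in the abstract.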
