Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping

Paper Title

Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping

Authors

Zhang, Daniel; Bailey, Colleen P.

Abstract

In this paper, we investigate the obstacle avoidance and navigation problem in robotic control. To solve this problem, we propose revised Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms with an improved reward shaping technique. We compare the original DDPG and PPO against their revised versions in simulations with a real mobile robot and demonstrate that the proposed algorithms achieve better results.
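The abstract does not spell out the reward design, so the following is only a generic illustration of what reward shaping can look like for goal-directed navigation, not the authors' method. It sketches potential-based shaping, where a dense term based on distance to the goal is added to a sparse task reward. The function names, the radii, the reward magnitudes, and the discount factor are all assumptions chosen for illustration.

```python
import math

def sparse_reward(next_position, goal, min_obstacle_distance,
                  goal_radius=0.2, collision_radius=0.1):
    """Hypothetical sparse task reward: -1 on collision, +1 at the goal, else 0."""
    if min_obstacle_distance < collision_radius:
        return -1.0
    if math.dist(next_position, goal) < goal_radius:
        return 1.0
    return 0.0

def potential(position, goal):
    """Assumed potential function: negative Euclidean distance to the goal."""
    return -math.dist(position, goal)

def shaped_reward(base_reward, position, next_position, goal, gamma=0.99):
    """Add the shaping term gamma * phi(s') - phi(s) to the base reward."""
    shaping = gamma * potential(next_position, goal) - potential(position, goal)
    return base_reward + shaping

# Usage: a step toward the goal earns more than a step away from it.
goal = (0.0, 0.0)
toward = shaped_reward(0.0, (2.0, 0.0), (1.0, 0.0), goal)
away = shaped_reward(0.0, (2.0, 0.0), (3.0, 0.0), goal)
print(toward > away)
```

A shaped reward of this kind can be passed to either DDPG or PPO in place of the sparse reward, since both algorithms only consume the scalar reward returned by the environment.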
