Discrete State-Action Abstraction via the Successor Representation

Paper title

Discrete State-Action Abstraction via the Successor Representation

Authors

Attali, Amnon, Cisneros-Velarde, Pedro, Morales, Marco, Amato, Nancy M.

Abstract

While the difficulty of reinforcement learning problems is typically related to the complexity of their state spaces, abstraction proposes that solutions often lie in simpler underlying latent spaces. Prior works have focused on learning either a continuous or dense abstraction, or have required a human to provide one. Information-dense representations capture features irrelevant for solving tasks, and continuous spaces can struggle to represent discrete objects. In this work we automatically learn a sparse discrete abstraction of the underlying environment. We do so using a simple end-to-end trainable model based on the successor representation and max-entropy regularization. We describe an algorithm to apply our model, named Discrete State-Action Abstraction (DSAA), which computes an action abstraction in the form of temporally extended actions, i.e., Options, to transition between discrete abstract states. Empirically, we demonstrate the effects of different exploration schemes on our resulting abstraction, and show that it is efficient for solving downstream tasks.
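
For readers unfamiliar with the successor representation mentioned in the abstract, the following is a minimal sketch of its standard tabular form, M = (I - γP)^{-1}, where P is the state-transition matrix under a fixed policy and γ is a discount factor. It is not the paper's model, which the abstract describes as end-to-end trainable; the four-state ring environment, the value of `gamma`, and the function name `successor_representation` are all assumptions chosen for illustration.

```python
import numpy as np

def successor_representation(P, gamma):
    """Closed-form tabular SR: M = sum_t gamma^t P^t = (I - gamma * P)^-1.

    M[s, s'] is the expected discounted number of visits to s' when
    starting from s and following the policy that induces P.
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

# Hypothetical environment: uniform random walk on a ring of 4 states.
P = np.zeros((4, 4))
for s in range(4):
    P[s, (s - 1) % 4] = 0.5
    P[s, (s + 1) % 4] = 0.5

gamma = 0.9  # assumed discount factor
M = successor_representation(P, gamma)
```

States with similar rows of `M` tend to reach the same parts of the environment, which is the kind of structure a discrete abstraction built on the successor representation could group together.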
