Paper Title
Behavioral Cloning via Search in Video PreTraining Latent Space
Paper Authors
Paper Abstract
Our aim is to build autonomous agents that can solve tasks in environments like Minecraft. To do so, we use an imitation learning-based approach. We formulate the control problem as a search problem over a dataset of expert demonstrations, where the agent copies actions from a similar demonstration trajectory of image-action pairs. We perform a proximity search over the BASALT MineRL dataset in the latent representation of a Video PreTraining model. The agent copies the actions from the expert trajectory as long as the distance between the state representations of the agent and the selected expert trajectory does not diverge. Once it diverges, the proximity search is repeated. Our approach can effectively recover meaningful demonstration trajectories and show human-like behavior of an agent in the Minecraft environment.
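The search-and-copy loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LatentSearchAgent`, the parameter `divergence_threshold`, and the use of Euclidean distance are all assumptions made for illustration, since the abstract does not specify the distance metric or the divergence criterion. The latent vectors are assumed to have been computed beforehand by a Video PreTraining encoder, which is not included here.

```python
import numpy as np


class LatentSearchAgent:
    """Hypothetical sketch: copy expert actions from the nearest demonstration frame."""

    def __init__(self, latents, actions, episode_ids, divergence_threshold):
        # latents: (N, D) array of precomputed expert frame embeddings (assumed given)
        # actions: length-N sequence, actions[i] is the expert action taken at frame i
        # episode_ids: length-N array marking which trajectory each frame belongs to
        self.latents = np.asarray(latents, dtype=np.float32)
        self.actions = list(actions)
        self.episode_ids = np.asarray(episode_ids)
        self.threshold = divergence_threshold  # assumed Euclidean divergence limit
        self.cursor = None  # index of the expert frame currently being followed

    def _search(self, z):
        # Proximity search: pick the expert frame closest to the agent's latent state.
        dists = np.linalg.norm(self.latents - z, axis=1)
        self.cursor = int(np.argmin(dists))

    def act(self, z):
        z = np.asarray(z, dtype=np.float32)
        # Search again if no trajectory is being followed or the states have diverged.
        if self.cursor is None or np.linalg.norm(self.latents[self.cursor] - z) > self.threshold:
            self._search(z)
        action = self.actions[self.cursor]
        # Advance along the same trajectory; at its end, force a new search next step.
        nxt = self.cursor + 1
        if nxt < len(self.actions) and self.episode_ids[nxt] == self.episode_ids[self.cursor]:
            self.cursor = nxt
        else:
            self.cursor = None
        return action
```

In this sketch the agent calls `act` once per environment step with the latent of its current observation. A brute-force nearest-neighbor scan is used for clarity; a large demonstration dataset would likely need an approximate nearest-neighbor index instead.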