Paper Title

Long-horizon video prediction using a dynamic latent hierarchy

Paper Authors

Alexey Zakharov, Qinghai Guo, Zafeirios Fountas

Paper Abstract

The task of video prediction and generation is notoriously difficult, with research in this area largely limited to short-term predictions. Though plagued with noise and stochasticity, videos consist of features that are organised in a spatiotemporal hierarchy, with different features possessing different temporal dynamics. In this paper, we introduce Dynamic Latent Hierarchy (DLH) -- a deep hierarchical latent model that represents videos as a hierarchy of latent states that evolve over separate and fluid timescales. Each latent state is a mixture distribution with two components, representing the immediate past and the predicted future, causing the model to learn transitions only between sufficiently dissimilar states, while clustering temporally persistent states closer together. Using this unique property, DLH naturally discovers the spatiotemporal structure of a dataset and learns disentangled representations across its hierarchy. We hypothesise that this simplifies the task of modelling the temporal dynamics of a video, improves the learning of long-term dependencies, and reduces error accumulation. As evidence, we demonstrate that DLH outperforms state-of-the-art benchmarks in video prediction, better represents stochasticity, and dynamically adjusts its hierarchical and temporal structure. Our paper shows, among other things, how progress in representation learning can translate into progress in prediction tasks.
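
To make the mechanism described in the abstract more concrete, below is a minimal, hedged sketch of a single level of such a latent hierarchy: each latent state is a two-component mixture over the immediate past and a predicted future, and the level transitions only when the two components are sufficiently dissimilar. The Gaussian parameterisation, the unit-variance "past" component, the KL-based dissimilarity test, and all names (`LatentLevel`, `threshold`, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of one level of a dynamic latent hierarchy (illustrative only).
# Assumptions: Gaussian latents, a unit-variance "past" component, and a KL-based
# test for whether the predicted future is dissimilar enough to trigger a transition.
import torch
import torch.nn as nn
import torch.distributions as D


class LatentLevel(nn.Module):
    """One hierarchy level: proposes the next latent state and decides whether
    this level should update (its own timescale) or persist."""

    def __init__(self, dim: int, threshold: float = 0.5):
        super().__init__()
        self.transition = nn.Sequential(
            nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 2 * dim)
        )
        self.threshold = threshold

    def propose(self, z_past: torch.Tensor) -> D.Normal:
        # Predicted-future component, parameterised by a small transition network.
        mu, log_std = self.transition(z_past).chunk(2, dim=-1)
        return D.Normal(mu, log_std.exp())

    def step(self, z_past: torch.Tensor):
        """Return the next latent state and a flag saying whether the level 'ticked'."""
        future = self.propose(z_past)
        past = D.Normal(z_past, torch.ones_like(z_past))  # assumed unit variance

        # Two-component mixture over the immediate past and the predicted future.
        batch = z_past.shape[0]
        weights = D.Categorical(probs=torch.full((batch, 2), 0.5))
        components = D.Independent(
            D.Normal(
                torch.stack([past.mean, future.mean], dim=-2),
                torch.stack([past.stddev, future.stddev], dim=-2),
            ),
            1,
        )
        mixture = D.MixtureSameFamily(weights, components)

        # Transition only when past and future are sufficiently dissimilar;
        # otherwise keep the temporally persistent state.
        dissimilarity = D.kl_divergence(future, past).sum(-1).mean()
        if dissimilarity > self.threshold:
            return mixture.sample(), True
        return z_past, False


# Usage: roll a single level forward; higher levels would tick less often.
level = LatentLevel(dim=16)
z = torch.zeros(1, 16)
for t in range(5):
    z, ticked = level.step(z)
    print(f"t={t} level updated: {ticked}")
```

Stacking several such levels, with each level updating only when its own components diverge, is one way the "separate and fluid timescales" described in the abstract could emerge; the actual DLH update rule and training objective are specified in the paper itself.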
