Paper title
Learning spatio-temporal representations with temporal squeeze pooling
Paper authors
Paper abstract
In this paper, we propose a new video representation learning method, named Temporal Squeeze (TS) pooling, which extracts the essential movement information from a long sequence of video frames and maps it into a small set of images, named Squeezed Images. By embedding Temporal Squeeze pooling as a layer into off-the-shelf Convolutional Neural Networks (CNNs), we design a new video classification model, named Temporal Squeeze Network (TeSNet). Because the pooling is optimized for the video classification task, the resulting Squeezed Images retain the essential movement information from the video frames. We evaluate our architecture on two video classification benchmarks and compare the results with the state of the art.
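The idea of mapping many frames onto a few images can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes, purely for illustration, that each output image is a softmax-weighted average of the input frames over time, with one row of hypothetical learnable scores (`logits`) per output image. The names `temporal_squeeze`, `softmax`, `clip`, and `params` are made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_squeeze(frames, logits):
    """Squeeze T frames into D images by weighted temporal averaging.

    frames: list of T frames, each a flat list of pixel values.
    logits: D rows of T scores (hypothetical learnable parameters,
            an assumption of this sketch rather than the paper's design).
    """
    num_frames = len(frames)
    squeezed = []
    for row in logits:
        if len(row) != num_frames:
            raise ValueError("each logit row needs one score per frame")
        weights = softmax(row)  # temporal weights sum to 1
        image = [
            sum(w * frame[p] for w, frame in zip(weights, frames))
            for p in range(len(frames[0]))
        ]
        squeezed.append(image)
    return squeezed


# Toy clip: 4 frames of 3 pixels each, squeezed into 2 images.
clip = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
# Row 0 weights all frames equally; row 1 puts almost all weight on the last frame.
params = [[0.0, 0.0, 0.0, 0.0], [-50.0, -50.0, -50.0, 50.0]]
squeezed_images = temporal_squeeze(clip, params)
```

In a trained model the scores would be learned jointly with the classifier, so that the squeezed images keep whatever temporal information helps classification. Here they are fixed by hand only to show the shape of the computation: T frames in, D images out.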