Paper Title
Enhancing Egocentric 3D Pose Estimation with Third Person Views
Paper Authors
Paper Abstract
In this paper, we propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured with a single wearable camera. The key idea is to leverage high-level features linking first- and third-person views in a joint embedding space. To learn such an embedding space, we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2,000 videos depicting human activities captured from both first- and third-person perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful for extracting discriminative features from arbitrary single-view egocentric videos, without requiring domain adaptation or knowledge of camera parameters. We achieve significant improvements in egocentric 3D body pose estimation on two unconstrained datasets, over three supervised state-of-the-art approaches. Our dataset and code will be made available for research purposes.
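To make the abstract's core idea concrete, below is a minimal sketch of a semi-Siamese embedding network trained with a self-supervised contrastive objective. The abstract does not specify the actual architecture, feature extractors, loss, or dimensions; everything here (class and function names, layer sizes, the InfoNCE-style loss, the assumption of pre-extracted spatial features such as RGB appearance and motion features such as optical flow) is an illustrative assumption, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiSiameseEmbedding(nn.Module):
    """Hypothetical semi-Siamese encoder: view-specific branches with a
    shared projection head mapping first- and third-person features into
    a joint embedding space. Dimensions are illustrative only."""
    def __init__(self, feat_dim=2048, embed_dim=256):
        super().__init__()
        # View-specific branches (not weight-shared): each fuses
        # spatial- and motion-domain features for its viewpoint.
        self.ego_branch = nn.Sequential(nn.Linear(2 * feat_dim, 512), nn.ReLU())
        self.exo_branch = nn.Sequential(nn.Linear(2 * feat_dim, 512), nn.ReLU())
        # Shared projection head: the weight-shared ("Siamese") part.
        self.shared_head = nn.Linear(512, embed_dim)

    def forward(self, ego_spatial, ego_motion, exo_spatial, exo_motion):
        # Concatenate spatial and motion features per view, then project
        # both views into the same L2-normalized embedding space.
        z_ego = self.shared_head(self.ego_branch(torch.cat([ego_spatial, ego_motion], dim=1)))
        z_exo = self.shared_head(self.exo_branch(torch.cat([exo_spatial, exo_motion], dim=1)))
        return F.normalize(z_ego, dim=1), F.normalize(z_exo, dim=1)

def info_nce_loss(z_ego, z_exo, temperature=0.07):
    """Self-supervised contrastive objective: synchronized first-/third-person
    clips in a batch are positives; all other cross-view pairs are negatives."""
    logits = z_ego @ z_exo.t() / temperature                       # (B, B) similarities
    targets = torch.arange(z_ego.size(0), device=z_ego.device)     # diagonal = true pairs
    return F.cross_entropy(logits, targets)
```

Under this sketch, the paired synchronized videos supply supervision for free: time-aligned first- and third-person clips of the same activity define the positive pairs, so no pose labels or camera parameters are needed to train the embedding.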