Action Quality Assessment using Transformers

Paper title

Action Quality Assessment using Transformers

Authors

Abhay Iyer, Mohammad Alali, Hemanth Bodala, Sunit Vaidya

Abstract

Action quality assessment (AQA) is an active research problem in video-based applications, and it is challenging because of the score variance per frame. Existing methods address this problem with convolution-based approaches, but these are limited in their ability to capture long-range dependencies. With the recent advancements in Transformers, we show that they are a suitable alternative to conventional convolution-based architectures. Specifically, we ask whether transformer-based models can solve the task of AQA by effectively capturing long-range dependencies, parallelizing computation, and providing a wider receptive field for diving videos. To demonstrate the effectiveness of our proposed architectures, we conducted comprehensive experiments and achieved a competitive Spearman correlation score of 0.9317. Additionally, we explore the effect of hyperparameters on the model's performance and pave a new path for exploiting Transformers in AQA.
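
The Spearman correlation mentioned in the abstract measures how well predicted scores preserve the ranking of the reference scores. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names and the five example scores are hypothetical and chosen only for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(predicted, target):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rt = average_ranks(predicted), average_ranks(target)
    mp, mt = sum(rp) / len(rp), sum(rt) / len(rt)
    cov = sum((a - mp) * (b - mt) for a, b in zip(rp, rt))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_t = sum((b - mt) ** 2 for b in rt)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_p * var_t) ** 0.5


# Hypothetical scores for five dives (illustrative only, not from the paper).
predicted_scores = [62.0, 85.5, 40.1, 91.3, 70.2]
judge_scores = [60.0, 88.0, 45.0, 79.5, 72.0]
rho = spearman(predicted_scores, judge_scores)
print(f"Spearman correlation: {rho:.4f}")
```

A value near 1 means the model orders the videos almost exactly as the reference scores do, even if the absolute predicted scores differ.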
