Paper title
Action Quality Assessment using Transformers
Paper authors
Paper abstract
Action quality assessment (AQA) is an active research problem in video-based applications, and it is challenging because of the variance in score from frame to frame. Existing methods address the problem with convolution-based approaches, but these are limited in how effectively they capture long-range dependencies. With the recent advances in Transformers, we show that they are a suitable alternative to conventional convolution-based architectures. Specifically, we ask whether transformer-based models can solve the AQA task by effectively capturing long-range dependencies, parallelizing computation, and providing a wider receptive field for diving videos. To demonstrate the effectiveness of our proposed architectures, we conducted comprehensive experiments and achieved a competitive Spearman correlation score of 0.9317. Additionally, we explore the effect of hyperparameters on the model's performance and pave a new path for exploiting Transformers in AQA.
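The abstract reports its result as a Spearman correlation, which measures how well the ranking of predicted scores agrees with the ranking of the judges' scores. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names `average_ranks` and `spearman` and the example scores are hypothetical and chosen only for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(predicted, judged):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(judged) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(judged)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up scores for five dives (illustrative only, not data from the paper).
    predicted = [82.1, 60.4, 95.0, 71.3, 48.9]
    judged = [80.0, 65.5, 92.5, 60.0, 50.0]
    print(spearman(predicted, judged))  # approximately 0.9
```

Because only the ranks enter the computation, a model can score well on this metric even when its predicted values are offset or rescaled relative to the judges' scores, as long as the ordering of the dives is preserved.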