3rd Place Solution for Google Universal Image Embedding

Paper title

3rd Place Solution for Google Universal Image Embedding

Authors

Nobuaki Aoki, Yasumasa Namba

Abstract

This paper presents the 3rd place solution to the Google Universal Image Embedding Competition on Kaggle. We use ViT-H/14 from OpenCLIP as the backbone of an ArcFace model and train it in two stages: the first stage trains with the backbone frozen, and the second stage trains the whole model. We achieve 0.692 mean Precision @5 on the private leaderboard. Code is available at https://github.com/YasumasaNamba/google-universal-image-embedding
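To make the ArcFace component mentioned in the abstract concrete, the following is a minimal pure-Python sketch of the standard additive angular margin applied to a single embedding. It is not the authors' code. The function names, the `scale` and `margin` defaults, and the toy two-class weights are illustrative assumptions, and the usual handling of angles where theta + margin exceeds pi is omitted for brevity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Scaled cosine logits with an additive angular margin on the target class.

    scale and margin are illustrative values, not taken from the paper.
    """
    emb = l2_normalize(embedding)
    logits = []
    for idx, weight in enumerate(class_weights):
        w = l2_normalize(weight)
        cos_theta = sum(e * c for e, c in zip(emb, w))
        # Clamp to guard against rounding slightly outside [-1, 1].
        cos_theta = max(-1.0, min(1.0, cos_theta))
        if idx == target:
            # Penalize the true class by widening its angle by `margin`.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one example."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


# Toy setup (hypothetical): a 2-D embedding equidistant from two class centers.
toy_embedding = [1.0, 1.0]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
loss_with_margin = cross_entropy(
    arcface_logits(toy_embedding, toy_weights, target=0), 0)
loss_without_margin = cross_entropy(
    arcface_logits(toy_embedding, toy_weights, target=0, margin=0.0), 0)
```

In this toy case the margin makes the loss on the true class strictly larger than plain scaled-cosine softmax would. That extra penalty is what pushes embeddings of the same class closer together during training. In a two-stage schedule like the one described, a head of this kind would presumably be trained first on frozen backbone features and then jointly with the backbone, though the abstract does not spell out those details.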
