Paper Title
3rd Place Solution for Google Universal Image Embedding
Paper Authors
Paper Abstract
This paper presents the 3rd place solution to the Google Universal Image Embedding competition on Kaggle. We use ViT-H/14 from OpenCLIP as the backbone of an ArcFace model and train it in two stages: the first stage is run with the backbone frozen, and the second stage trains the whole model. We achieve a mean Precision @5 of 0.692 on the private leaderboard. Code is available at https://github.com/YasumasaNamba/google-universal-image-embedding
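As a rough illustration of the ArcFace objective mentioned in the abstract, the sketch below computes additive-angular-margin logits for a single embedding in plain Python. It is not taken from the authors' repository: the function names, the `scale` and `margin` values, and the toy vectors are assumptions chosen for readability. It also omits the special handling that full implementations apply when the angle plus the margin exceeds pi.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Scaled cosine logits with an angular margin added on the target class.

    `scale` and `margin` are illustrative defaults, not the authors' settings.
    """
    unit_embedding = l2_normalize(embedding)
    logits = []
    for class_index, weight in enumerate(class_weights):
        unit_weight = l2_normalize(weight)
        cos_theta = sum(a * b for a, b in zip(unit_embedding, unit_weight))
        # Clamp to guard acos against floating-point drift outside [-1, 1].
        cos_theta = max(-1.0, min(1.0, cos_theta))
        if class_index == target:
            # Penalize the true class by widening its angle by `margin`.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


if __name__ == "__main__":
    # Toy 2-D embedding and two hypothetical class centers.
    toy_embedding = [0.9, 0.1]
    toy_weights = [[1.0, 0.0], [0.0, 1.0]]
    toy_logits = arcface_logits(toy_embedding, toy_weights, target=0)
    print(toy_logits, cross_entropy(toy_logits, target=0))
```

In a two-stage schedule like the one described, a head of this kind would presumably be trained first on fixed backbone features, and the backbone would then be unfrozen so that the whole model is updated. That mapping is an interpretation of the abstract, not a description of the released code.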