Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification

Paper title

Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification

Authors

Oren Nuriel, Sagie Benaim, Lior Wolf

Abstract

Recent work has shown that convolutional neural network classifiers overly rely on texture at the expense of shape cues. We make a similar but different distinction between shape and local image cues, on the one hand, and global image statistics, on the other. Our method, called Permuted Adaptive Instance Normalization (pAdaIN), reduces the representation of global statistics in the hidden layers of image classifiers. pAdaIN samples a random permutation $π$ that rearranges the samples in a given batch. Adaptive Instance Normalization (AdaIN) is then applied between the activations of each (non-permuted) sample $i$ and the corresponding activations of the sample $π(i)$, thus swapping statistics between the samples of the batch. Since the global image statistics are distorted, this swapping procedure causes the network to rely on cues such as shape or texture. By choosing the random permutation with probability $p$ and the identity permutation otherwise, one can control the strength of the effect. With the correct choice of $p$, fixed a priori for all experiments and selected without considering test data, our method consistently outperforms baselines in multiple settings. In image classification, our method improves on both CIFAR-100 and ImageNet using multiple architectures. In the setting of robustness, our method improves on both ImageNet-C and CIFAR-100-C for multiple architectures. In the setting of domain adaptation and domain generalization, our method achieves state-of-the-art results on the transfer learning task from GTAV to Cityscapes and on the PACS benchmark.
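
The swapping step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the standard AdaIN definition, AdaIN(x, y) = σ(y) (x − μ(x)) / σ(x) + μ(y), with statistics taken per sample and per channel over the spatial dimensions, and activations laid out as (batch, channels, height, width). The function names `adain` and `permuted_adain`, the `eps` constant, and the use of a NumPy random generator are illustrative assumptions. The abstract does not say which layers the operation is applied to, so the sketch covers a single activation tensor only.

```python
import numpy as np


def adain(x, y, eps=1e-5):
    """Re-normalize x so each (sample, channel) map takes on y's mean and std.

    x, y: arrays of shape (batch, channels, height, width) -- assumed layout.
    eps:  small constant for numerical stability (assumed value).
    """
    mu_x = x.mean(axis=(2, 3), keepdims=True)
    sigma_x = x.std(axis=(2, 3), keepdims=True) + eps
    mu_y = y.mean(axis=(2, 3), keepdims=True)
    sigma_y = y.std(axis=(2, 3), keepdims=True) + eps
    return sigma_y * (x - mu_x) / sigma_x + mu_y


def permuted_adain(x, p, rng):
    """With probability p, swap statistics between samples of the batch.

    Otherwise the identity permutation is used, which leaves x unchanged.
    """
    if rng.random() >= p:
        return x  # identity permutation: AdaIN(x, x) is x
    perm = rng.permutation(x.shape[0])  # random permutation pi of the batch
    return adain(x, x[perm])  # sample i receives the statistics of sample pi(i)


# Usage on a dummy batch of activations (training time only).
rng = np.random.default_rng(0)
activations = rng.normal(size=(8, 16, 4, 4))
mixed = permuted_adain(activations, p=0.5, rng=rng)
```

Each sample keeps its own normalized spatial pattern while its per-channel mean and standard deviation are replaced by those of another sample in the batch, which is the sense in which global statistics are distorted. The value `p=0.5` above is a placeholder for the example, not a setting taken from the paper.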
