Paper Title
Symmetry Defense Against CNN Adversarial Perturbation Attacks
Paper Authors
Paper Abstract
This paper uses symmetry to make Convolutional Neural Network (CNN) classifiers robust against adversarial perturbation attacks. Such attacks add perturbation to original images to generate adversarial images that fool classifiers, such as the road-sign classifiers of autonomous vehicles. Although symmetry is a pervasive aspect of the natural world, CNNs handle it poorly; for example, a CNN can classify an image differently from its mirror image. For an adversarial image misclassified with a wrong label $l_w$, the CNN's inability to handle symmetry means that a symmetric adversarial image can be classified differently from the wrong label $l_w$. Beyond that, we find that the classification of a symmetric adversarial image reverts to the correct label. To classify an image when adversaries are unaware of the defense, we apply a symmetry to the image and use the classification label of the symmetric image. To classify an image when adversaries are aware of the defense, we use mirror symmetry and pixel-inversion symmetry to form a symmetry group. We apply all the group symmetries to the image and decide the output label based on the agreement of any two of the classification labels of the symmetric images. Adaptive attacks fail because they must rely on loss functions that use conflicting CNN output values for the symmetric images. Without attack knowledge, the proposed symmetry defense succeeds against both gradient-based and random-search attacks, with accuracies of up to near-default levels on ImageNet. The defense even improves the classification accuracy of original images.
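The defense-aware variant described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `classify` callable, the abstention behavior on disagreement, and the pixel value range of $[0, 1]$ are assumptions made for the sketch.

```python
import numpy as np


def horizontal_flip(img):
    """Mirror symmetry: flip the image left-right (img is H x W x C)."""
    return img[:, ::-1, :]


def pixel_inversion(img):
    """Pixel-inversion symmetry: invert intensities (assumes pixels in [0, 1])."""
    return 1.0 - img


def symmetry_defense(img, classify):
    """Apply the symmetry group {identity, flip, inversion, flip+inversion}
    and output a label on which any two of the symmetric images agree.

    `classify` is a hypothetical callable mapping an image array to a label.
    Returns None (abstains) when no two symmetric images agree.
    """
    symmetric_images = [
        img,
        horizontal_flip(img),
        pixel_inversion(img),
        pixel_inversion(horizontal_flip(img)),
    ]
    labels = [classify(s) for s in symmetric_images]
    # Decide the output label from the agreement of any two labels.
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] == labels[j]:
                return labels[i]
    return None
```

A clean (non-adversarial) image classified consistently under all four symmetries yields immediate agreement, while an adversarial image whose symmetric copies revert to the correct label is outvoted, which matches the abstract's claim that adaptive attacks must fight conflicting CNN outputs across the group.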