Paper Title

Robust Models are less Over-Confident

Authors

Julia Grabinski, Paul Gavrikov, Janis Keuper, Margret Keuper

Abstract

Despite the success of convolutional neural networks (CNNs) in many academic benchmarks for computer vision tasks, their application in the real world still faces fundamental challenges. One of these open problems is the inherent lack of robustness, unveiled by the striking effectiveness of adversarial attacks. Current attack methods are able to manipulate the network's prediction by adding specific but small amounts of noise to the input. In turn, adversarial training (AT) aims to achieve robustness against such attacks, and ideally a better model generalization ability, by including adversarial samples in the training set. However, an in-depth analysis of the resulting robust models beyond adversarial robustness is still pending. In this paper, we empirically analyze a variety of adversarially trained models that achieve high robust accuracies when facing state-of-the-art attacks, and we show that AT has an interesting side effect: it leads to models that are significantly less overconfident in their decisions, even on clean data, than non-robust models. Further, our analysis of robust models shows that not only AT but also the model's building blocks (like activation functions and pooling) have a strong influence on the models' prediction confidences. Data & Project website: https://github.com/GeJulia/robustness_confidences_evaluation
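
The abstract rests on two technical ingredients: crafting adversarial samples for the training set (AT) and measuring over-confidence on clean inputs. The sketch below illustrates both under stated assumptions; it is not the authors' released code. It assumes a PyTorch classifier `model` and a `DataLoader` named `loader` (both hypothetical placeholders), generates an adversarial example with a single FGSM step, and estimates confidence as the mean maximum softmax probability.

```python
# Minimal sketch, not the authors' implementation. Assumes a PyTorch
# classifier `model` and a DataLoader `loader` over (image, label) pairs.
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8 / 255):
    """Return an adversarially perturbed copy of x (one FGSM step)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Move each pixel by eps in the direction that increases the loss.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

@torch.no_grad()
def mean_confidence(model, loader, device="cpu"):
    """Average max-softmax confidence of the model on clean data."""
    model.eval()
    total, count = 0.0, 0
    for images, _ in loader:
        probs = F.softmax(model(images.to(device)), dim=1)
        conf = probs.max(dim=1).values  # per-sample prediction confidence
        total += conf.sum().item()
        count += conf.numel()
    return total / count
```

Comparing `mean_confidence` between a standard and an adversarially trained model (the kind of comparison the paper performs at scale) makes the claimed side effect concrete: the robust model's average confidence on clean data would be the lower of the two.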
