Paper Title

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Paper Authors

Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao

Abstract

Deep neural networks often suffer from poor generalization caused by complex and non-convex loss landscapes. One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape by minimizing the maximum change in training loss under a perturbation of the weights. However, we find that SAM's indiscriminate perturbation of all parameters is suboptimal and also results in excessive computation, i.e., doubling the overhead of common optimizers like Stochastic Gradient Descent (SGD). In this paper, we propose an efficient and effective training scheme coined Sparse SAM (SSAM), which achieves a sparse perturbation via a binary mask. To obtain the sparse mask, we provide two solutions based on Fisher information and dynamic sparse training, respectively. In addition, we theoretically prove that SSAM converges at the same rate as SAM, i.e., $O(\log T/\sqrt{T})$. Sparse SAM not only has the potential for training acceleration but also smooths the loss landscape effectively. Extensive experimental results on CIFAR10, CIFAR100, and ImageNet-1K confirm the superior efficiency of our method over SAM, and performance is preserved or even improved with a perturbation of merely 50% sparsity. Code is available at https://github.com/Mi-Peng/Sparse-Sharpness-Aware-Minimization.
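To make the sparsified-perturbation idea concrete, below is a minimal PyTorch-style sketch of a single SSAM-like update step, assuming a binary mask (one tensor per parameter, obtained elsewhere, e.g., from Fisher information or dynamic sparse training) is already available. The function name `ssam_step` and its arguments are illustrative assumptions, not the authors' API; the official implementation lives in the linked repository.

```python
import torch

def ssam_step(model, loss_fn, inputs, targets, base_optimizer, mask, rho=0.05):
    """One illustrative SSAM-style step: sparsify the SAM perturbation with a binary mask.

    `mask` is assumed to be a list of 0/1 tensors aligned with model.parameters().
    """
    # 1) First forward/backward pass: gradient at the current weights.
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    # 2) Build the sparsified perturbation eps = rho * (mask * g) / ||g|| and apply it.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    eps_list = []
    with torch.no_grad():
        for p, m in zip(model.parameters(), mask):
            if p.grad is None:
                eps_list.append(None)
                continue
            eps = rho * (m * p.grad) / (grad_norm + 1e-12)  # mask zeroes out most coordinates
            p.add_(eps)
            eps_list.append(eps)
    model.zero_grad()

    # 3) Second forward/backward pass at the perturbed weights.
    loss_fn(model(inputs), targets).backward()

    # 4) Restore the original weights and update with the perturbed gradient.
    with torch.no_grad():
        for p, eps in zip(model.parameters(), eps_list):
            if eps is not None:
                p.sub_(eps)
    base_optimizer.step()
    model.zero_grad()
    return loss.item()
```

With a mask of 50% sparsity, half of the parameters receive no perturbation in step 2, which is what allows the perturbation (and potentially its computation) to be sparsified while the overall two-pass structure of SAM is retained.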
