Paper Title
Drastically Reducing the Number of Trainable Parameters in Deep CNNs by Inter-layer Kernel-sharing
Paper Authors
Paper Abstract
Deep convolutional neural networks (DCNNs) have become the state-of-the-art (SOTA) approach for many computer vision tasks: image classification, object detection, semantic segmentation, etc. However, most SOTA networks are too large for edge computing. Here, we suggest a simple way to reduce the number of trainable parameters and thus the memory footprint: sharing kernels between multiple convolutional layers. Kernel-sharing is only possible between "isomorphic" layers, i.e. layers having the same kernel size and the same numbers of input and output channels. This is typically the case inside each stage of a DCNN. Our experiments on CIFAR-10 and CIFAR-100, using the ConvMixer and SE-ResNet architectures, show that the number of parameters of these models can be drastically reduced with minimal cost in accuracy. The resulting networks are appealing for certain edge computing applications that are subject to severe memory constraints, and even more so when leveraging "frozen weights" hardware accelerators. Kernel-sharing is also an efficient regularization method, which can reduce overfitting. The code is publicly available at https://github.com/AlirezaAzadbakht/kernel-sharing.
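To make the idea concrete, the following is a minimal PyTorch sketch of inter-layer kernel-sharing, not the authors' implementation (see their repository for that). It stacks several isomorphic depthwise convolutional layers in a ConvMixer-style block and reuses a single convolution module across all of them, so the stack contributes only one set of kernel weights plus the per-layer normalization parameters. The class name `SharedKernelStack` and the hyperparameter values are illustrative assumptions.

```python
# Minimal sketch of inter-layer kernel-sharing (illustrative, not the
# paper's reference code). Assumes PyTorch is installed.
import torch
import torch.nn as nn


class SharedKernelStack(nn.Module):
    """A stack of 'isomorphic' depthwise conv layers reusing one kernel.

    All layers call the same nn.Conv2d instance, so the convolution
    weights are counted (and trained) only once for the whole stack.
    """

    def __init__(self, channels: int, kernel_size: int, depth: int):
        super().__init__()
        # The single shared convolution: its weights are the only
        # convolutional parameters in the entire stack.
        self.shared_conv = nn.Conv2d(
            channels, channels, kernel_size,
            padding=kernel_size // 2, groups=channels, bias=False,
        )
        # Per-layer batch norms stay independent (cheap in parameters).
        self.norms = nn.ModuleList(
            [nn.BatchNorm2d(channels) for _ in range(depth)]
        )
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for norm in self.norms:
            # The same kernel is applied at every layer; a residual
            # connection is used, as in ConvMixer blocks.
            x = x + norm(self.act(self.shared_conv(x)))
        return x


if __name__ == "__main__":
    stack = SharedKernelStack(channels=256, kernel_size=9, depth=8)
    n_params = sum(p.numel() for p in stack.parameters() if p.requires_grad)
    # Roughly 8x fewer conv weights than 8 independent depthwise layers.
    print(f"trainable parameters: {n_params}")
    y = stack(torch.randn(1, 256, 32, 32))
    print(tuple(y.shape))
```

Because the shared module appears once in the parameter list, optimizers and parameter counts handle the sharing automatically; gradients from every layer simply accumulate into the one kernel tensor during backpropagation.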