Paper Title
Convert, compress, correct: Three steps toward communication-efficient DNN training
Paper Authors
Paper Abstract
In this paper, we introduce a novel algorithm, $\mathsf{CO}_3$, for communication-efficient distributed Deep Neural Network (DNN) training. $\mathsf{CO}_3$ is a joint training/communication protocol, which encompasses three processing steps for the network gradients: (i) quantization through floating-point conversion, (ii) lossless compression, and (iii) error correction. These three components are crucial in the implementation of distributed DNN training over rate-constrained links. The interplay of these three steps in processing the DNN gradients is carefully balanced to yield a robust and high-performance scheme. The performance of the proposed scheme is investigated through numerical evaluations on CIFAR-10.
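As a rough illustration of the three processing steps named in the abstract, the following is a minimal sketch, not the authors' actual scheme: quantization is taken to be a float32-to-float16 conversion, lossless compression uses `zlib`, and error correction is assumed to be classic error feedback (carrying the quantization residual into the next round). The function names and the specific choices of format and codec are illustrative assumptions.

```python
import zlib
import numpy as np

def co3_like_step(grad, error_memory):
    """One round of a CO3-style gradient pipeline (hedged sketch).

    grad, error_memory: float32 arrays of the same shape.
    Returns the compressed payload (bytes) and the updated residual memory.
    """
    # (iii) error correction, assumed here as error feedback:
    # add the residual left over from the previous round's quantization.
    corrected = grad + error_memory
    # (i) quantization through floating-point conversion (float32 -> float16).
    quantized = corrected.astype(np.float16)
    # Remember what quantization discarded, to be re-injected next round.
    new_memory = corrected - quantized.astype(np.float32)
    # (ii) lossless compression of the quantized payload before transmission.
    payload = zlib.compress(quantized.tobytes())
    return payload, new_memory

def co3_like_decode(payload, shape):
    """Receiver side: losslessly decompress and restore float32 gradients."""
    raw = zlib.decompress(payload)
    return np.frombuffer(raw, dtype=np.float16).reshape(shape).astype(np.float32)
```

In this sketch the decoded gradient plus the sender's residual memory reconstructs the original gradient (up to float32 rounding), which is the property that lets error feedback compensate for quantization loss over successive training rounds.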