Paper Title


Distributed Linearly Separable Computation

Authors

Kai Wan, Hua Sun, Mingyue Ji, Giuseppe Caire

Abstract


This paper formulates a distributed computation problem, where a master asks $N$ distributed workers to compute a linearly separable function. The task function can be expressed as $K_c$ linear combinations of $K$ messages, where each message is a function of one dataset. Our objective is to find the optimal tradeoff between the computation cost (number of uncoded datasets assigned to each worker) and the communication cost (number of symbols the master must download), such that from the answers of any $N_r$ out of $N$ workers the master can recover the task function with high probability, where the coefficients of the $K_c$ linear combinations are uniformly i.i.d. over some large enough finite field. The formulated problem can be seen as a generalized version of some existing problems, such as distributed gradient coding and distributed linear transform. In this paper, we consider the specific case where the computation cost is minimum, and propose novel achievability schemes and converse bounds for the optimal communication cost. Achievability and converse bounds coincide for some system parameters; when they do not match, we prove that the achievable distributed computing scheme is optimal under the constraint of a widely used 'cyclic assignment' scheme on the datasets. Our results also show that when $K = N$, with the same communication cost as the optimal distributed gradient coding scheme proposed by Tandon et al. from which the master recovers one linear combination of $K$ messages, our proposed scheme can let the master recover any additional $N_r - 1$ independent linear combinations of messages with high probability.
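To make the minimum-computation-cost setting concrete, the following minimal sketch (an illustration only, not the paper's coding scheme) builds the cyclic dataset assignment for the special case $K = N$: each worker stores $N - N_r + 1$ consecutive datasets, so by a pigeonhole argument every set of $N_r$ responding workers jointly covers all $K$ datasets. The function name `cyclic_assignment` and the toy parameters $N = 6$, $N_r = 4$ are assumptions for this example.

```python
from itertools import combinations

def cyclic_assignment(N, N_r):
    """Cyclic dataset assignment for K = N datasets and N workers.

    Each worker n stores the N - N_r + 1 consecutive datasets
    {n, n+1, ..., n + N - N_r} (mod N). This is the minimum
    computation cost at which any N_r workers still jointly
    hold every dataset.
    """
    span = N - N_r + 1  # datasets per worker (computation cost)
    return [{(n + j) % N for j in range(span)} for n in range(N)]

# Toy check: with N = 6 workers and N_r = 4 responding workers,
# every 4-subset of workers collectively holds all 6 datasets,
# since each dataset is replicated at 3 workers and at most
# 2 workers are missing.
N, N_r = 6, 4
assign = cyclic_assignment(N, N_r)
for subset in combinations(range(N), N_r):
    covered = set().union(*(assign[n] for n in subset))
    assert covered == set(range(N))
```

The covering property is what makes any task function of all $K$ messages recoverable in principle from any $N_r$ answers; the paper's contribution is the coding on top of this assignment that minimizes how many symbols the master must download.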
