Paper Title
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Paper Authors
Paper Abstract
Many recent breakthroughs in deep learning were achieved by training increasingly larger models on massive datasets. However, training such models can be prohibitively expensive. For instance, the cluster used to train GPT-3 costs over \$250 million. As a result, most researchers cannot afford to train state-of-the-art models and contribute to their development. Hypothetically, a researcher could crowdsource the training of large neural networks with thousands of regular PCs provided by volunteers. The raw computing power of a hundred thousand \$2500 desktops dwarfs that of a \$250M server pod, but one cannot utilize that power efficiently with conventional distributed training methods. In this work, we propose Learning@home: a novel neural network training paradigm designed to handle large numbers of poorly connected participants. We analyze the performance, reliability, and architectural constraints of this paradigm and compare it against existing distributed training techniques.
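To make the mixture-of-experts building block named in the title concrete, below is a minimal, self-contained PyTorch sketch of an MoE layer with top-k gating. All names here (SimpleMoE, num_experts, k) are illustrative assumptions, not the paper's actual API; in the decentralized setting the paper proposes, each expert would live on a different volunteer machine and be reached over the network rather than stored in a local ModuleList.

```python
# Illustrative sketch (not the paper's implementation): a mixture-of-experts
# layer where a gating network routes each input to its top-k experts and
# mixes their outputs. Learning@home distributes such experts across
# poorly connected volunteer nodes; here they are local for simplicity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)  # gating network
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                           nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Score all experts, keep the top-k per input, and softmax-mix them.
        scores = self.gate(x)                              # [batch, num_experts]
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)           # [batch, k]
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = topk_idx[:, slot] == e              # inputs routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out


if __name__ == "__main__":
    moe = SimpleMoE(dim=64, num_experts=8, k=2)
    y = moe(torch.randn(16, 64))
    print(y.shape)  # torch.Size([16, 64])
```

Because only k experts run per input, total capacity can grow with the number of participants while per-input compute stays roughly constant, which is what makes the design a plausible fit for crowdsourced hardware.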