Paper Title
TopicBERT for Energy Efficient Document Classification
Paper Authors
Paper Abstract
Prior research notes that BERT's computational cost grows quadratically with sequence length, leading to longer training times, higher GPU memory requirements, and greater carbon emissions. While recent work addresses these scalability issues at pre-training, they are also prominent in fine-tuning, especially for long-sequence tasks such as document classification. Our work therefore focuses on optimizing the computational cost of fine-tuning for document classification. We achieve this through complementary learning of topic and language models in a unified framework, named TopicBERT. This significantly reduces the number of self-attention operations, the main performance bottleneck. Consequently, our model achieves a 1.4x ($\sim40\%$) speedup with a $\sim40\%$ reduction in $CO_2$ emissions while retaining $99.9\%$ of performance over 5 datasets.
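As a minimal illustration of the quadratic-cost claim in the abstract (not part of the paper; the head count and the focus on score entries are assumptions for the sketch), the number of query-key attention scores computed per layer grows as the square of the sequence length:

```python
# Sketch (assumption: per-layer cost is dominated by the n x n attention
# score matrix computed by each head) of why self-attention cost grows
# quadratically with sequence length n.

def attention_score_entries(seq_len: int, num_heads: int = 12) -> int:
    """Query-key score entries computed per layer (num_heads * n * n)."""
    return num_heads * seq_len * seq_len

# Doubling the sequence length quadruples the work:
for n in (128, 256, 512):
    print(n, attention_score_entries(n))
# 128 196608
# 256 786432
# 512 3145728
```

This is why shortening the effective sequence seen by the transformer, as TopicBERT does, directly reduces the dominant cost term.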