Paper Title

Decoupled Context Processing for Context Augmented Language Modeling

Paper Authors

Zonglin Li, Ruiqi Guo, Sanjiv Kumar

Paper Abstract

Language models can be augmented with a context retriever to incorporate knowledge from large external databases. By leveraging retrieved context, the neural network does not have to memorize the massive amount of world knowledge within its internal parameters, leading to better parameter efficiency, interpretability, and modularity. In this paper, we examine a simple yet effective architecture for incorporating external context into language models, based on a decoupled encoder-decoder architecture. We show that such a simple architecture achieves competitive results on autoregressive language modeling and open-domain question answering tasks. We also analyze the behavior of the proposed model, which performs grounded context transfer. Finally, we discuss the computational implications of such retrieval-augmented models.
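The abstract gives no implementation details, so the following is a minimal PyTorch sketch of the decoupling idea only: retrieved context is encoded independently of the query, and the autoregressive decoder attends to the cached encoding through cross-attention. All class names, hyperparameters, and the caching pattern here are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a decoupled encoder-decoder for context-augmented
# language modeling. Everything below (module names, sizes, the two-step
# encode-then-decode pattern) is an illustrative assumption; the key idea
# is that context encoding does not depend on the query, so it can be
# precomputed and cached per passage in the external database.

import torch
import torch.nn as nn


class ContextEncoder(nn.Module):
    """Encodes a retrieved context passage independently of the query."""

    def __init__(self, vocab_size=32000, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, context_ids):
        # (batch, ctx_len) -> (batch, ctx_len, d_model); cacheable offline.
        return self.encoder(self.embed(context_ids))


class ContextAugmentedDecoder(nn.Module):
    """Autoregressive decoder that cross-attends to cached context encodings."""

    def __init__(self, vocab_size=32000, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids, context_memory):
        # Causal mask keeps the language model autoregressive.
        causal_mask = nn.Transformer.generate_square_subsequent_mask(
            input_ids.size(1)
        )
        hidden = self.decoder(
            self.embed(input_ids), context_memory, tgt_mask=causal_mask
        )
        return self.lm_head(hidden)  # next-token logits


# Usage: encode the retrieved passage once, reuse the memory for decoding.
encoder, decoder = ContextEncoder(), ContextAugmentedDecoder()
context_ids = torch.randint(0, 32000, (1, 64))  # a retrieved passage
prefix_ids = torch.randint(0, 32000, (1, 16))   # the running text prefix
with torch.no_grad():
    memory = encoder(context_ids)         # precomputable per passage
    logits = decoder(prefix_ids, memory)  # (1, 16, 32000)
```

Because `memory` depends only on the retrieved passage, it can in principle be precomputed for every document in the external database, which is one way to read the computational implications the abstract mentions.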
