Paper Title


Why Do Neural Language Models Still Need Commonsense Knowledge to Handle Semantic Variations in Question Answering?

Paper Authors

Sunjae Kwon, Cheongwoong Kang, Jiyeon Han, Jaesik Choi

Paper Abstract


Many contextualized word representations are now learned by intricate neural network models, such as masked neural language models (MNLMs), which consist of huge neural network structures and are trained to restore masked text. Such representations demonstrate superhuman performance on some reading comprehension (RC) tasks, which extract a proper answer from the context given a question. However, identifying the detailed knowledge trained into MNLMs is challenging owing to their numerous and intermingled model parameters. This paper provides new insights and empirical analyses of the commonsense knowledge included in pretrained MNLMs. First, we use a diagnostic test that evaluates whether commonsense knowledge is properly trained in MNLMs. We observe that a large proportion of commonsense knowledge is not appropriately trained in MNLMs and that MNLMs often do not understand the semantic meaning of relations accurately. In addition, we find that MNLM-based RC models are still vulnerable to semantic variations that require commonsense knowledge. Finally, we identify the fundamental reason why some knowledge is not trained. We further suggest that utilizing an external commonsense knowledge repository can be an effective solution. We demonstrate the possibility of overcoming the limitations of MNLM-based RC models by enriching text with the required knowledge from an external commonsense knowledge repository in controlled experiments.
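To make the idea of a cloze-style diagnostic test concrete, below is a minimal sketch (not the authors' code) of probing a pretrained MNLM with commonsense triples phrased as fill-in-the-blank sentences and inspecting its top predictions. The choice of bert-base-uncased and the specific probe sentences are illustrative assumptions; the paper's actual diagnostic set and models may differ.

```python
# Minimal masked-LM probing sketch using the Hugging Face "fill-mask" pipeline.
# Assumption: bert-base-uncased stands in for the MNLMs studied in the paper.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical probe sentences expressing ConceptNet-style relations.
probes = [
    "A knife is used for [MASK].",    # UsedFor
    "A bird is capable of [MASK].",   # CapableOf
    "A wheel is part of a [MASK].",   # PartOf
]

for sentence in probes:
    # Retrieve the model's top-3 candidate fillers for the masked slot.
    predictions = unmasker(sentence, top_k=3)
    tokens = [p["token_str"] for p in predictions]
    print(f"{sentence} -> {tokens}")
```

If the expected object of the relation does not appear among the top predictions, the corresponding piece of commonsense knowledge is arguably not well captured by the model, which is the kind of gap the paper proposes to fill by enriching the input text with facts from an external commonsense knowledge repository.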
