Paper Title

Intrinsic Probing through Dimension Selection

Authors

Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell

Abstract

Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks. Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it. In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted. To enable intrinsic probing, we propose a novel framework based on a decomposable multivariate Gaussian probe that allows us to determine whether the linguistic information in word embeddings is dispersed or focal. We then probe fastText and BERT for various morphosyntactic attributes across 36 languages. We find that most attributes are reliably encoded by only a few neurons, with fastText concentrating its linguistic structure more than BERT.
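To make the idea concrete, below is a minimal sketch of what a class-conditional multivariate Gaussian probe with greedy dimension selection could look like. It is an illustration under stated assumptions, not the authors' released implementation: the function names, the regularization, the dev-set accuracy objective, and the synthetic data are all hypothetical. The key property the sketch relies on is that the marginal of a multivariate Gaussian over any subset of dimensions is itself Gaussian, so the probe decomposes across dimension subsets and forward selection stays cheap.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical sketch of a decomposable Gaussian probe for intrinsic probing.

def fit_probe(X, y, dims, reg=1e-4):
    """Fit one Gaussian per attribute value over the selected dimensions."""
    probe = {}
    for c in np.unique(y):
        Xc = X[y == c][:, dims]
        mean = Xc.mean(axis=0)
        # Small ridge term keeps the covariance well-conditioned (assumption).
        cov = np.cov(Xc, rowvar=False) + reg * np.eye(len(dims))
        log_prior = np.log((y == c).mean())
        probe[c] = (mean, cov, log_prior)
    return probe

def accuracy(X, y, dims, probe):
    """Classify by maximum class-conditional log-likelihood plus log-prior."""
    scores = np.stack(
        [multivariate_normal.logpdf(X[:, dims], mean=m, cov=S) + p
         for m, S, p in probe.values()],
        axis=1,
    )
    classes = np.array(list(probe.keys()))
    return (classes[scores.argmax(axis=1)] == y).mean()

def greedy_select(X_train, y_train, X_dev, y_dev, k):
    """Greedily add the dimension that most improves dev accuracy."""
    selected, remaining = [], list(range(X_train.shape[1]))
    for _ in range(k):
        def score(d):
            dims = selected + [d]
            return accuracy(X_dev, y_dev, dims,
                            fit_probe(X_train, y_train, dims))
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for word embeddings labeled with one morphosyntactic
    # value: only dimensions 3 and 7 actually carry the signal.
    y = rng.integers(0, 2, size=500)
    X = rng.normal(size=(500, 16))
    X[:, 3] += 2.0 * y
    X[:, 7] -= 1.5 * y
    print("selected dimensions:",
          greedy_select(X[:300], y[:300], X[300:], y[300:], k=3))
```

On the toy data the first two selected dimensions should be 3 and 7, mirroring the paper's finding that a handful of neurons often suffice to encode an attribute; how concentrated the signal is (focal vs. dispersed) can then be read off the accuracy curve as dimensions are added.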
