Title
Are "Undocumented Workers" the Same as "Illegal Aliens"? Disentangling Denotation and Connotation in Vector Spaces
Authors
Abstract
In politics, neologisms are frequently invented for partisan objectives. For example, "undocumented workers" and "illegal aliens" refer to the same group of people (i.e., they have the same denotation), but they carry clearly different connotations. Examples like these have traditionally posed a challenge to reference-based semantic theories and led to increasing acceptance of alternative theories (e.g., Two-Factor Semantics) among philosophers and cognitive scientists. In NLP, however, popular pretrained models encode both denotation and connotation as one entangled representation. In this study, we propose an adversarial neural network that decomposes a pretrained representation into independent denotation and connotation representations. For intrinsic interpretability, we show that words with the same denotation but different connotations (e.g., "immigrants" vs. "aliens", "estate tax" vs. "death tax") move closer to each other in denotation space while moving further apart in connotation space. For extrinsic application, we train an information retrieval system with our disentangled representations and show that the denotation vectors improve the viewpoint diversity of document rankings.