Semi-constraint Optimal Transport for Entity Alignment with Dangling Cases

Paper Title

Semi-constraint Optimal Transport for Entity Alignment with Dangling Cases

Authors

Luo, Shengxuan; Cheng, Pengyu; Yu, Sheng

Abstract

Entity alignment (EA) merges knowledge graphs (KGs) by identifying equivalent entities in different graphs, which can effectively enrich the knowledge representations of the KGs. In practice, however, different KGs often include dangling entities whose counterparts cannot be found in the other graph, which limits the performance of EA methods. To improve EA with dangling entities, we propose an unsupervised method called Semi-constraint Optimal Transport for Entity Alignment in Dangling cases (SoTead). Our main idea is to model the entity alignment between two KGs as an optimal transport problem from the entities of one KG to those of the other. First, we set pseudo entity pairs between KGs based on pretrained word embeddings. Then, we conduct contrastive metric learning to obtain the transport cost between each entity pair. Finally, we introduce a virtual entity for each KG to "align" the dangling entities from the other KG, which relaxes the optimization constraints and leads to a semi-constraint optimal transport. In the experiments, we first show the superiority of SoTead on a commonly used entity alignment dataset. In addition, to compare its dangling-entity detection ability against other baselines, we construct a medical cross-lingual knowledge graph dataset, MedED, on which SoTead also reaches state-of-the-art performance.
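The virtual-entity idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a precomputed cost matrix, uses entropic (Sinkhorn) optimal transport as a stand-in solver, and the function name `align_with_virtual_entities` and the parameters `dangling_cost`, `reg`, and `n_iters` are hypothetical. Each KG gets one extra virtual entity that can absorb all entities of the other KG, so a real entity is only matched when that is cheaper than sending it to the virtual one.

```python
import numpy as np

def align_with_virtual_entities(cost, dangling_cost, reg=0.05, n_iters=500):
    """Illustrative sketch (assumed formulation, not the paper's exact one).

    cost: (n, m) array of transport costs between source and target entities.
    dangling_cost: assumed cost of sending a real entity to a virtual entity.
    Returns (matches, plan); matches[i] is a target index or None (dangling).
    """
    n, m = cost.shape
    # Augment the cost matrix with one virtual row and one virtual column.
    aug = np.full((n + 1, m + 1), dangling_cost, dtype=float)
    aug[:n, :m] = cost
    aug[n, m] = 0.0  # virtual-to-virtual transport is free

    # Real entities carry unit mass. The virtual source can absorb all m
    # targets and the virtual target can absorb all n sources, so both
    # marginals sum to n + m and the augmented problem is balanced.
    a = np.ones(n + 1)
    a[n] = m
    b = np.ones(m + 1)
    b[m] = n

    # Standard Sinkhorn iterations on the Gibbs kernel.
    K = np.exp(-aug / reg)
    u = np.ones(n + 1)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]

    # A source entity is labeled dangling when most of its mass goes to
    # the virtual target (column index m).
    best = plan[:n].argmax(axis=1)
    matches = [None if j == m else int(j) for j in best]
    return matches, plan
```

With this construction, matching a real pair competes against sending both entities to virtual ones, so a pair tends to be kept only when its cost is below roughly `2 * dangling_cost`. In a real pipeline the cost matrix would come from the learned entity metric rather than being given by hand.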
