Paper Title

Second-Order Unsupervised Neural Dependency Parsing

Authors

Songlin Yang, Yong Jiang, Wenjuan Han, Kewei Tu

Abstract

Most unsupervised dependency parsers are based on first-order probabilistic generative models that only consider local parent-child information. Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information. We also propose a novel design of the neural parameterization and optimization methods of the dependency models. In second-order models, the number of grammar rules grows cubically with the vocabulary size, making it difficult to train lexicalized models whose vocabularies may contain thousands of words. To circumvent this problem while still benefiting from both second-order parsing and lexicalization, we use the agreement-based learning framework to jointly train a second-order unlexicalized model and a first-order lexicalized model. Experiments on multiple datasets show the effectiveness of our second-order models compared with recent state-of-the-art methods. Our joint model achieves a 10% improvement over the previous state-of-the-art parser on the full WSJ test set.
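To make the cubic growth concrete: a second-order lexicalized rule conditions on three word types at once (e.g., grandparent, head, and child words, or head, sibling, and child words), so the rule table scales with the cube of the vocabulary size, whereas an unlexicalized model ranges only over a few dozen POS tags. The sketch below is a back-of-the-envelope count under that assumed rule shape, not the paper's exact grammar:

```python
# Illustrative count of rule-table sizes (assumption: each second-order
# rule is indexed by a word triple such as grandparent-head-child, so the
# table grows as |V|^3; a first-order parent-child rule grows as |V|^2).

def first_order_rules(vocab_size: int) -> int:
    """One entry per (head word, child word) pair."""
    return vocab_size ** 2

def second_order_rules(vocab_size: int) -> int:
    """One entry per (grandparent, head, child) word triple."""
    return vocab_size ** 3

# ~45 Penn Treebank POS tags (unlexicalized) vs. lexicalized vocabularies.
for v in (45, 1_000, 10_000):
    print(f"|V|={v:>6}: first-order {first_order_rules(v):>16,}, "
          f"second-order {second_order_rules(v):>18,}")
```

Running this shows why lexicalization is tractable for a first-order model but blows up at second order: at |V| = 10,000 the triple-indexed table already has 10^12 entries.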
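For reference, agreement-based learning in the style of Liang et al. (2008) trains independently parameterized models to place probability mass on the same latent structure. A minimal sketch of such a joint objective over a shared dependency tree, assuming this standard form rather than the paper's exact formulation:

```latex
% Joint agreement objective over sentences x with shared dependency tree z,
% where Z(x) is the set of valid trees for x, p_1 is the second-order
% unlexicalized model and p_2 the first-order lexicalized model
% (standard agreement-based form; the paper's objective may differ).
\mathcal{L}(\theta_1, \theta_2)
  = \sum_{x} \log \sum_{z \in \mathcal{Z}(x)}
      p_1(x, z; \theta_1)\, p_2(x, z; \theta_2)
```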
