Pseudo Labeling and Negative Feedback Learning for Large-scale Multi-label Domain Classification

Paper Title

Pseudo Labeling and Negative Feedback Learning for Large-scale Multi-label Domain Classification

Authors

Joo-Kyung Kim, Young-Bum Kim

Abstract

In large-scale domain classification, an utterance can be handled by multiple domains with overlapping capabilities. In practice, however, only a limited number of ground-truth domains are provided for each training utterance, even though knowing as many correct target labels as possible helps improve model performance. In this paper, given one ground-truth domain for each training utterance, we regard domains consistently predicted with the highest confidences as additional pseudo labels for training. To reduce prediction errors caused by incorrect pseudo labels, we leverage utterances with negative system responses to decrease the confidences of the incorrectly predicted domains. Evaluating on user utterances from an intelligent conversational system, we show that the proposed approach significantly improves the performance of domain classification with hypothesis reranking.
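
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rule that a domain counts as "consistently predicted" only if it is the top non-ground-truth prediction in every recorded pass, and the plain binary cross-entropy form of the loss are all assumptions made for illustration.

```python
import math


def select_pseudo_labels(history, ground_truth):
    """Return the ground-truth domain plus, possibly, one pseudo label.

    history: list of dicts mapping domain -> confidence, one dict per
    prediction pass over the same utterance (hypothetical format).
    Assumed rule: a domain becomes a pseudo label only if it is the
    highest-confidence non-ground-truth domain in every pass.
    """
    top_domains = set()
    for confidences in history:
        others = {d: c for d, c in confidences.items() if d != ground_truth}
        if others:
            top_domains.add(max(others, key=others.get))
    labels = {ground_truth}
    if len(top_domains) == 1:
        labels |= top_domains
    return labels


def multilabel_loss(confidences, positives, negatives=()):
    """Binary cross-entropy over the labeled domains only.

    positives: ground-truth and pseudo-label domains (pushed toward 1).
    negatives: domains that drew a negative system response (pushed toward 0).
    """
    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for domain in positives:
        loss -= math.log(confidences[domain] + eps)
    for domain in negatives:
        loss -= math.log(1.0 - confidences[domain] + eps)
    return loss


history = [
    {"Music": 0.8, "Radio": 0.7, "Weather": 0.1},
    {"Music": 0.9, "Radio": 0.6, "Weather": 0.2},
]
labels = select_pseudo_labels(history, "Music")  # "Radio" is added as a pseudo label
```

In this sketch, a domain that received a negative system response contributes a term that shrinks as its confidence falls, which is one simple way to express "decreasing the confidence of incorrectly predicted domains" as a training signal.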
