Paper Title
A Data Efficient End-To-End Spoken Language Understanding Architecture
Authors
Abstract
End-to-end architectures have recently been proposed for spoken language understanding (SLU) and semantic parsing. Trained on large amounts of data, these models jointly learn acoustic and linguistic-sequential features. While such architectures give very good results for domain, intent, and slot detection, their application to more complex semantic chunking and tagging tasks is less straightforward. For this reason, models are in many cases combined with an external language model to enhance their performance. In this paper we introduce a data-efficient system that is trained end-to-end, with no additional pre-trained external module. One key feature of our approach is an incremental training procedure in which acoustic, language, and semantic models are trained sequentially, one after the other. The proposed model has a reasonable size and achieves results competitive with the state of the art while using a small training dataset. In particular, we reach a 24.02% Concept Error Rate (CER) on MEDIA/test while training on MEDIA/train only, without any additional data.
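To make the incremental training procedure concrete, the sketch below shows one plausible way to train stacked acoustic, linguistic, and semantic sub-modules one after the other, each stage reusing the weights learned by the previous one. This is a minimal illustrative example, not the authors' implementation: the module names (`StackedSLU`, `train_stage`), layer sizes, per-frame targets, and cross-entropy losses are all assumptions made for clarity.

```python
# Hypothetical sketch of incremental (stage-wise) training for an
# end-to-end SLU stack: acoustic -> linguistic -> semantic.
# All names, dimensions, and losses are illustrative assumptions.
import torch
import torch.nn as nn

class StackedSLU(nn.Module):
    def __init__(self, n_feats=40, hidden=128, n_chars=30, n_tokens=500, n_concepts=80):
        super().__init__()
        self.acoustic   = nn.GRU(n_feats, hidden, batch_first=True)  # stage 1
        self.linguistic = nn.GRU(hidden, hidden, batch_first=True)   # stage 2
        self.semantic   = nn.GRU(hidden, hidden, batch_first=True)   # stage 3
        self.heads = nn.ModuleDict({
            "chars":    nn.Linear(hidden, n_chars),     # character targets
            "tokens":   nn.Linear(hidden, n_tokens),    # token targets
            "concepts": nn.Linear(hidden, n_concepts),  # semantic concept tags
        })

    def forward(self, x, stage):
        # Run only the sub-network up to the current stage.
        h, _ = self.acoustic(x)
        if stage >= 2:
            h, _ = self.linguistic(h)
        if stage >= 3:
            h, _ = self.semantic(h)
        head = {1: "chars", 2: "tokens", 3: "concepts"}[stage]
        return self.heads[head](h)  # (batch, time, classes)

def train_stage(model, stage, batches, epochs=3):
    """Train the stack up to `stage` on that stage's own supervision."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for feats, targets in batches:
            opt.zero_grad()
            logits = model(feats, stage)
            # CrossEntropyLoss expects (batch, classes, time) for framewise targets.
            loss = loss_fn(logits.transpose(1, 2), targets)
            loss.backward()
            opt.step()

# Incremental schedule: each call adds one stage, reusing earlier weights.
model = StackedSLU()
dummy = lambda c: [(torch.randn(2, 50, 40), torch.randint(0, c, (2, 50)))]
train_stage(model, 1, dummy(30))   # acoustic model, character targets
train_stage(model, 2, dummy(500))  # + language model, token targets
train_stage(model, 3, dummy(80))   # + semantic model, concept targets
```

In this toy schedule every stage fine-tunes all parameters below it; a variant closer in spirit to curriculum-style pre-training would freeze the earlier stages before each new one is trained, which can be done by setting `requires_grad = False` on their parameters.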