Paper Title
Online Continual Learning of End-to-End Speech Recognition Models
Authors
Abstract
Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available. While prior research on continual learning in automatic speech recognition has focused on the adaptation of models across multiple different speech recognition tasks, in this paper we propose an experimental setting for \textit{online continual learning} for automatic speech recognition of a single task. Specifically focusing on the case where additional training data for the same task becomes available incrementally over time, we demonstrate the effectiveness of performing incremental model updates to end-to-end speech recognition models with an online Gradient Episodic Memory (GEM) method. Moreover, we show that with online continual learning and a selective sampling strategy, we can maintain an accuracy that is similar to retraining a model from scratch while requiring significantly lower computation costs. We have also verified our method with self-supervised learning (SSL) features.
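The core mechanism named in the abstract, Gradient Episodic Memory, constrains each update so that the loss on previously seen examples held in an episodic memory does not increase. A minimal sketch of the single-constraint variant of that projection (the form used in A-GEM, where one reference gradient is computed over the memory) is shown below; the function name `gem_project` and the use of flat gradient vectors are illustrative assumptions, not the paper's implementation:

```python
def gem_project(g, g_ref):
    """Sketch of the single-constraint GEM gradient projection.

    g      -- gradient of the loss on the current (new-data) batch
    g_ref  -- gradient of the loss on the episodic-memory batch

    If g conflicts with g_ref (negative inner product), the conflicting
    component along g_ref is removed, so the projected update cannot
    increase the memory loss to first order. Otherwise g is unchanged.
    """
    dot = sum(a * b for a, b in zip(g, g_ref))
    if dot >= 0.0:
        return list(g)  # no conflict: keep the original gradient
    scale = dot / sum(b * b for b in g_ref)
    return [a - scale * b for a, b in zip(g, g_ref)]
```

In an online continual-learning loop, this projection would be applied to the flattened model gradients at every incremental update step, with the episodic memory populated by the selective sampling strategy the abstract describes.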