Paper title
Wav2Vec-Aug: Improved self-supervised training with limited data
Paper authors
Paper abstract
Self-supervised learning (SSL) of speech representations has received much attention over the last few years, but most work has focused on languages and domains with an abundance of unlabeled data. However, for many languages there is a shortage even of unlabeled data, which limits the effectiveness of SSL. In this work, we focus on the problem of applying SSL to domains with limited available data by leveraging data augmentation for Wav2Vec 2.0 pretraining. Further, we propose improvements to each component of the model, which result in a combined relative word error rate (WER) improvement of up to 13% compared to Wav2Vec 2.0 on LibriSpeech test-clean/other.
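The abstract does not say which augmentations are applied during pretraining, so the following is a minimal sketch only, showing what waveform-level data augmentation before a Wav2Vec 2.0 encoder could look like. The function name `augment_waveform`, the specific augmentations (additive noise, random gain, time masking), and all parameter values are illustrative assumptions, not details taken from the paper.

```python
import torch

def augment_waveform(wave: torch.Tensor,
                     noise_std: float = 0.01,
                     max_gain_db: float = 6.0,
                     mask_frac: float = 0.05) -> torch.Tensor:
    """Apply simple waveform-level augmentations to a 1-D audio tensor.

    Hypothetical example: the paper's actual augmentation recipe is not
    specified in the abstract.
    """
    out = wave.clone()
    # Additive Gaussian noise
    out = out + noise_std * torch.randn_like(out)
    # Random gain drawn uniformly from [-max_gain_db, +max_gain_db]
    gain_db = (torch.rand(1).item() * 2.0 - 1.0) * max_gain_db
    out = out * (10.0 ** (gain_db / 20.0))
    # Zero out one random contiguous span (time masking)
    span = int(mask_frac * out.numel())
    if 0 < span < out.numel():
        start = torch.randint(0, out.numel() - span, (1,)).item()
        out[start:start + span] = 0.0
    return out
```

For instance, `augment_waveform(torch.randn(16000))` would augment one second of 16 kHz audio; in an SSL pipeline, such a transform would typically be applied on the fly to each unlabeled training utterance.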