Paper Title
Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives
Paper Authors
Paper Abstract
In this paper, we investigate the instability of standard dense retrieval training, which iterates between model training and hard negative selection using the model being trained. We show the catastrophic forgetting phenomenon behind the training instability, where models learn and forget different negative groups across training iterations. We then propose ANCE-Tele, which accumulates momentum negatives from past iterations and approximates future iterations using lookahead negatives, acting as "teleportations" along the time axis that smooth the learning process. On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size, eliminates the dependency on sparse retrieval negatives, and is competitive with systems using significantly more (50x) parameters. Our analysis demonstrates that teleportation negatives reduce catastrophic forgetting and improve convergence speed for dense retrieval training. Our code is available at https://github.com/OpenMatch/ANCE-Tele.
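The abstract describes the training loop only at a high level. Below is a minimal Python sketch of how teleportation negatives could be assembled in one training episode, under stated assumptions: `model.retrieve`, `train`, and `mine_negatives` are hypothetical stand-ins introduced here for illustration, not functions from the ANCE-Tele codebase; see the repository above for the authors' actual implementation.

```python
import copy
from typing import Callable, Dict, List, Set

# Hypothetical interfaces (not from the ANCE-Tele repo): `model` is any
# retriever exposing retrieve(query, k) -> List[str] of passage ids, and
# `train` fine-tunes a retriever on the given negatives for some steps.

def mine_negatives(model, queries: List[str],
                   positives: Dict[str, Set[str]],
                   k: int = 50) -> Dict[str, List[str]]:
    """Top-k retrieved passages per query, minus labeled positives."""
    return {
        q: [p for p in model.retrieve(q, k) if p not in positives[q]]
        for q in queries
    }

def tele_negatives(model, queries: List[str],
                   positives: Dict[str, Set[str]],
                   momentum: Dict[str, List[str]],
                   train: Callable,
                   lookahead_steps: int = 1000) -> Dict[str, List[str]]:
    """Teleportation negatives for one episode: momentum + lookahead."""
    # Lookahead negatives: briefly train a *copy* of the current model,
    # then mine with it to approximate the next episode's hard negatives.
    peek = copy.deepcopy(model)
    train(peek, momentum, steps=lookahead_steps)
    lookahead = mine_negatives(peek, queries, positives)

    # Momentum negatives: keep negatives accumulated from past episodes
    # so the model keeps seeing groups it would otherwise forget.
    return {q: momentum.get(q, []) + lookahead[q] for q in queries}
```

Per the abstract, the momentum term counters catastrophic forgetting of previously learned negative groups, while the lookahead term anticipates the next iteration's hard negatives, smoothing the learning process along the time axis.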