Paper Title
Understanding How Model Size Affects Few-shot Instruction Prompting
Paper Authors
Paper Abstract
Large Language Models are affected by the phenomena of memorizing and forgetting their training data, but how do these vary with model size? We work toward answering this question by investigating how model size affects a model's ability to discriminate a word's meaning in a given context. We introduce a dataset called DeltaWords, which evaluates a model's ability to follow an instruction to select the sentence that replaces a target word with its antonym. We show a weak inverse scaling trend, in which task accuracy degrades as model size increases under extremely few-shot prompting regimes. We also show that increasing the number of examples tends to benefit larger models disproportionately more than it benefits smaller ones.
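To make the task format concrete, here is a minimal sketch of how a few-shot prompt for a DeltaWords-style item might be constructed. The field names, instruction wording, and two-option answer format are assumptions for illustration; the abstract describes the task only at a high level.

```python
# Hypothetical sketch of a few-shot prompt for a DeltaWords-style task.
# The data fields and prompt wording below are assumptions, not the
# paper's actual dataset schema.

def build_prompt(examples, query):
    """Builds a few-shot prompt asking the model to pick the sentence
    that replaces the target word with its antonym."""
    instruction = (
        "Choose the option that rewrites the sentence by replacing "
        "the target word with its antonym.\n\n"
    )
    blocks = []
    for ex in examples + [query]:
        options = "\n".join(
            f"{label}. {text}" for label, text in zip("AB", ex["options"])
        )
        answer = ex.get("answer", "")  # left blank for the query item
        blocks.append(
            f"Sentence: {ex['sentence']}\n"
            f"Target word: {ex['target']}\n"
            f"{options}\n"
            f"Answer: {answer}"
        )
    return instruction + "\n\n".join(blocks)

# One in-context example (a "shot") followed by the unanswered query.
example = {
    "sentence": "The road ahead was narrow.",
    "target": "narrow",
    "options": ["The road ahead was wide.", "The road ahead was short."],
    "answer": "A",
}
query = {
    "sentence": "The room felt warm.",
    "target": "warm",
    "options": ["The room felt cold.", "The room felt bright."],
}
print(build_prompt([example], query))
```

Varying the length of the `examples` list in such a setup is what the abstract refers to as changing the number of shots, with accuracy then compared across model sizes.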