Paper Title
Iteratively Prompt Pre-trained Language Models for Chain of Thought
Paper Authors
Paper Abstract
While Pre-trained Language Models (PLMs) internalize a great amount of world knowledge, they have been shown incapable of recalling this knowledge to solve tasks requiring complex and multi-step reasoning. Similar to how humans develop a "chain of thought" for these tasks, how can we equip PLMs with such abilities? In this work, we explore an iterative prompting framework, a new prompting paradigm that progressively elicits relevant knowledge from PLMs for multi-step inference. We identify key limitations of existing prompting methods: they are either restricted to queries with a single identifiable relation/predicate, or agnostic to input contexts, which makes it difficult to capture the variability across different inference steps. We propose an iterative, context-aware prompter, which addresses these limitations by learning to dynamically synthesize prompts conditioned on the current step's context. Experiments on three datasets involving multi-step reasoning show the effectiveness of the iterative scheme and the context-aware prompter design.
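To make the iterative scheme more concrete, below is a minimal Python sketch of the loop the abstract describes: a learned prompter synthesizes a prompt from the query and the knowledge recalled so far, a frozen PLM generates one piece of knowledge, and that output is fed back as context for the next step. All names here (iterative_prompting, plm, prompter) and the fixed step count are illustrative assumptions, not the paper's actual architecture or training procedure.

```python
# Minimal sketch of the iterative prompting loop (illustrative only).
from typing import Callable, List


def iterative_prompting(
    query: str,
    plm: Callable[[str], str],                  # frozen PLM: prompt -> generated text
    prompter: Callable[[str, List[str]], str],  # context-aware prompter: (query, recalled so far) -> prompt
    num_steps: int = 3,
) -> List[str]:
    """Progressively elicit knowledge from the PLM, one inference step at a time.

    At each step, the prompter conditions on the query and everything recalled
    so far (the current step's context) to synthesize a new prompt; the PLM's
    output is appended to the context and used in the next step.
    """
    recalled: List[str] = []
    for _ in range(num_steps):
        prompt = prompter(query, recalled)   # dynamically synthesized, context-conditioned prompt
        knowledge = plm(prompt)              # elicit one piece of relevant knowledge
        recalled.append(knowledge)
    return recalled                          # chain of recalled facts for downstream answering


# Toy usage with stand-in callables (no real PLM involved):
facts = iterative_prompting(
    "In what country was the author's mother born?",
    plm=lambda prompt: f"(knowledge elicited for: {prompt[:40]}...)",
    prompter=lambda q, hist: f"{q} | known so far: {len(hist)} facts",
)
print(facts)
```

The key design choice this sketch highlights is that the prompt is recomputed at every step from the accumulated context, rather than being a single fixed prompt tied to one relation or predicate.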