Paper Title
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Paper Authors
Paper Abstract
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), SVAMP (+11.0%), AQuA (+12.2%), StrategyQA (+6.4%) and ARC-challenge (+3.9%).
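To make the decoding strategy concrete, here is a minimal Python sketch of the sample-then-majority-vote procedure the abstract describes: sample several reasoning paths with temperature sampling, discard the paths themselves, and keep the most frequent final answer. The `sample_fn` callback and the regex-based `extract_answer` are hypothetical stand-ins (any sampled LLM call and any task-specific answer parser would do); this illustrates the idea, not the paper's reference implementation.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Pull the final numeric answer out of a chain-of-thought completion.

    A simple regex stands in for task-specific answer parsing.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None


def self_consistency(prompt: str,
                     sample_fn: Callable[[str], str],
                     num_samples: int = 40) -> Optional[str]:
    """Sample many reasoning paths, then majority-vote over final answers."""
    # Step 1: sample a diverse set of reasoning paths (sample_fn should call
    # the model with temperature sampling, not greedy decoding).
    completions = [sample_fn(prompt) for _ in range(num_samples)]
    # Step 2: marginalize out the reasoning paths by keeping only each
    # path's final answer, then pick the most frequent (most consistent) one.
    answers = [extract_answer(c) for c in completions]
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    import random

    def fake_sampler(prompt: str) -> str:
        # Stand-in for an LLM call: mostly agrees on one answer, with noise.
        return f"... so the answer is {random.choice(['18', '18', '18', '17'])}."

    print(self_consistency("Q: ...", fake_sampler, num_samples=10))
```

Under this reading, the "marginalization" in the abstract reduces to a plurality vote: reasoning paths are treated as latent, and only their induced answer distribution matters.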