Paper Title
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Paper Authors
Paper Abstract
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), SVAMP (+11.0%), AQuA (+12.2%), StrategyQA (+6.4%) and ARC-challenge (+3.9%).
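To make the decoding strategy concrete, here is a minimal Python sketch of the sample-then-majority-vote procedure the abstract describes: sample several reasoning paths with temperature sampling, discard the paths themselves, and keep the most frequent final answer. The `sample_fn` callback and the regex-based `extract_answer` are hypothetical stand-ins (any sampled LLM call and any task-specific answer parser would do); this illustrates the idea, not the paper's reference implementation.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Pull the final numeric answer out of a chain-of-thought completion.

    A simple regex stands in for task-specific answer parsing.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None


def self_consistency(prompt: str,
                     sample_fn: Callable[[str], str],
                     num_samples: int = 40) -> Optional[str]:
    """Sample many reasoning paths, then majority-vote over final answers."""
    # Step 1: sample a diverse set of reasoning paths (sample_fn should call
    # the model with temperature sampling, not greedy decoding).
    completions = [sample_fn(prompt) for _ in range(num_samples)]
    # Step 2: marginalize out the reasoning paths by keeping only each
    # path's final answer, then pick the most frequent (most consistent) one.
    answers = [extract_answer(c) for c in completions]
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    import random

    def fake_sampler(prompt: str) -> str:
        # Stand-in for an LLM call: mostly agrees on one answer, with noise.
        return f"... so the answer is {random.choice(['18', '18', '18', '17'])}."

    print(self_consistency("Q: ...", fake_sampler, num_samples=10))
```

Under this reading, the "marginalization" in the abstract reduces to a plurality vote: reasoning paths are treated as latent, and only their induced answer distribution matters.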