On Reinforcement Learning for the Game of 2048

Title

On Reinforcement Learning for the Game of 2048

Author

Guei, Hung

Abstract

2048 is a single-player stochastic puzzle game. This intriguing and addictive game is popular worldwide and has attracted researchers to develop game-playing programs. Due to its simplicity and complexity, 2048 has become an interesting and challenging platform for evaluating the effectiveness of machine learning methods. This dissertation conducts comprehensive research on reinforcement learning and computer game algorithms for 2048. First, this dissertation proposes optimistic temporal difference learning, which significantly improves the quality of learning by employing optimistic initialization to encourage exploration for 2048. Furthermore, based on this approach, a state-of-the-art program for 2048 is developed, which achieves the highest performance among all learning-based programs, namely an average score of 625377 points and a rate of 72% for reaching the 32768-tile. Second, this dissertation investigates several techniques related to 2048, including n-tuple network ensemble learning, Monte Carlo tree search, and deep reinforcement learning. These techniques are promising for further improving the performance of the current state-of-the-art program. Finally, this dissertation discusses pedagogical applications related to 2048 by proposing course designs and summarizing the teaching experience. The proposed course designs adopt 2048-like games as materials for beginners to learn reinforcement learning and computer game algorithms. The courses have been successfully applied to graduate-level students and were well received according to student feedback.
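
The abstract's central idea, optimistic initialization as a way to make a greedy temporal difference learner explore, can be illustrated with a minimal sketch. This is not the dissertation's program: it assumes a hypothetical 7-state corridor instead of 2048, a plain value table instead of an n-tuple network, and illustrative constants (`GAMMA`, `ALPHA`, `TERMINAL_REWARD`) chosen only for this example. The agent is purely greedy, so any exploration comes from the initial values alone.

```python
# Toy corridor (an assumption for illustration, not the 2048 environment):
# states 0..6, both ends terminal. The left end is close but pays 0.2,
# the right end is farther away but pays 1.0.
GAMMA, ALPHA = 0.9, 0.5
N_STATES, START = 7, 2
TERMINAL_REWARD = {0: 0.2, 6: 1.0}


def train(initial_value, episodes=200, max_steps=1000):
    """Greedy TD(0) with a value table; returns the last episode's reward."""
    values = [initial_value] * N_STATES
    for s in TERMINAL_REWARD:
        values[s] = 0.0  # terminal states have no future value

    final_reward = 0.0
    for _ in range(episodes):
        state, final_reward = START, 0.0
        for _ in range(max_steps):
            # Score both moves with the known transition model (left, right).
            candidates = []
            for nxt in (state - 1, state + 1):
                reward = TERMINAL_REWARD.get(nxt, 0.0)
                candidates.append((reward + GAMMA * values[nxt], nxt, reward))
            # Pure greedy choice; ties go to the first candidate (left).
            target, nxt, reward = max(candidates, key=lambda c: c[0])
            # TD(0) update toward the greedy one-step target.
            values[state] += ALPHA * (target - values[state])
            state = nxt
            if state in TERMINAL_REWARD:
                final_reward = reward
                break
    return final_reward


if __name__ == "__main__":
    print("zero initialization:      ", train(0.0))
    print("optimistic initialization:", train(1.0))
```

With zero initialization the greedy agent finds the nearby 0.2 reward first and never tries the other direction. With values initialized at 1.0, every untried state looks better than it is, so the agent keeps moving toward unvisited states until their estimates have been corrected, and in this toy setting it ends up at the 1.0 reward.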
