Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework

Paper title

Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework

Authors

Nguyen, Ngoc Duy; Nguyen, Thanh Thi; Nguyen, Hai; Creighton, Doug; Nahavandi, Saeid

Abstract

The integration of deep learning with reinforcement learning (RL) has enabled RL to perform efficiently in high-dimensional environments. Deep RL methods have been applied to solve many complex real-world problems in recent years. However, developing a deep RL-based system is challenging because of issues such as the selection of a suitable deep RL algorithm, its network configuration, training time, and training methods. This paper proposes a comprehensive software framework that not only plays a vital role in designing a connect-the-dots deep RL architecture but also provides a guideline for developing a realistic RL application in a short time span. We have designed and developed a deep RL-based software framework that strictly ensures flexibility, robustness, and scalability. By inheriting the proposed architecture, software managers can foresee challenges when designing a deep RL-based system. As a result, they can expedite the design process and actively control every stage of software development, which is especially critical in agile development environments. To enforce generalization, the proposed architecture does not depend on a specific RL algorithm, a network configuration, the number of agents, or the type of agents. Using our framework, software developers can develop and integrate new RL algorithms or new types of agents, and can flexibly change the network configuration or the number of agents.
