Paper Title
Emergent Behaviors in Multi-Agent Target Acquisition
Authors
Abstract
Only limited studies and superficial evaluations are available on agents' behaviors and roles within a Multi-Agent System (MAS). We simulate a MAS using Reinforcement Learning (RL) in a pursuit-evasion (a.k.a. predator-prey pursuit) game, which shares task goals with target acquisition. We create different adversarial scenarios by replacing the RL-trained pursuers' policies with two distinct (non-RL) analytical strategies. Using heatmaps of agents' positions (a state-space variable) over time, we are able to categorize an RL-trained evader's behaviors. The novelty of our approach lies in creating an influential feature set that reveals underlying data regularities, which allow us to classify an agent's behavior. This classification may aid in catching (enemy) targets by enabling us to identify and predict their behaviors. When extended to pursuers, this approach to identifying teammates' behavior may allow agents to coordinate more effectively.
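As an illustration only, the position heatmaps mentioned in the abstract could be built roughly as follows. This is a minimal sketch, not the authors' implementation: the function names `position_heatmap` and `normalize`, the square arena bounds, and the grid resolution are all assumptions introduced here.

```python
def position_heatmap(trajectory, grid_size, bounds=(-1.0, 1.0)):
    """Count how often an agent visits each cell of a square grid.

    trajectory: iterable of (x, y) positions over time (assumed format).
    grid_size:  number of cells per axis (assumed resolution).
    bounds:     (low, high) extent of the arena on both axes (assumed).
    """
    low, high = bounds
    cell = (high - low) / grid_size
    heat = [[0] * grid_size for _ in range(grid_size)]
    for x, y in trajectory:
        # Clamp indices so positions on or beyond the arena edge
        # are counted in the nearest border cell.
        col = min(max(int((x - low) / cell), 0), grid_size - 1)
        row = min(max(int((y - low) / cell), 0), grid_size - 1)
        heat[row][col] += 1
    return heat


def normalize(heat):
    """Turn visit counts into an occupancy distribution that sums to 1."""
    total = sum(sum(row) for row in heat)
    if total == 0:
        return [[0.0 for _ in row] for row in heat]
    return [[count / total for count in row] for row in heat]


if __name__ == "__main__":
    # Hypothetical evader trajectory, for demonstration only.
    evader_path = [(-0.9, -0.9), (-0.8, -0.7), (0.9, 0.9), (1.0, 1.0)]
    for row in normalize(position_heatmap(evader_path, grid_size=2)):
        print(row)
```

Flattening such a normalized grid gives a fixed-length vector per episode, which is one plausible starting point for the kind of feature set and behavior classification the abstract describes.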