Paper Title
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
Paper Authors
Paper Abstract
In this paper, we consider multi-agent learning via online gradient descent (OGD) in a class of games called $λ$-cocoercive games, a fairly broad class of games that admits many Nash equilibria and that properly includes unconstrained strongly monotone games. We characterize the finite-time last-iterate convergence rate for joint OGD learning on $λ$-cocoercive games; further, building on this result, we develop a fully adaptive OGD learning algorithm that does not require any knowledge of problem parameters (e.g., the cocoercivity constant $λ$) and show, via a novel double-stopping-time technique, that this adaptive algorithm achieves the same finite-time last-iterate convergence rate as its non-adaptive counterpart. Subsequently, we extend OGD learning to the noisy gradient feedback case and establish last-iterate convergence results -- first qualitative almost-sure convergence, then quantitative finite-time convergence rates -- all under non-decreasing step-sizes. To our knowledge, we provide the first set of results that fill in several gaps in the existing multi-agent online learning literature, where three aspects -- finite-time convergence rates, non-decreasing step-sizes, and fully adaptive algorithms -- have been unexplored before.
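To make the joint OGD dynamics concrete, here is a minimal sketch (not the paper's adaptive algorithm) of each player independently running gradient descent on its own cost in a simple two-player strongly monotone game, which is a special case of a $λ$-cocoercive game. The specific costs and step-size below are illustrative assumptions, not taken from the paper.

```python
# Hypothetical two-player game with costs
#   f1(x, y) = x^2 + x*y   (player 1 controls x)
#   f2(x, y) = y^2 - x*y   (player 2 controls y)
# The game operator F(x, y) = (2x + y, 2y - x) is strongly monotone,
# hence cocoercive; the unique Nash equilibrium is (x, y) = (0, 0).

def grad1(x, y):
    """Partial derivative of f1 with respect to player 1's action x."""
    return 2.0 * x + y

def grad2(x, y):
    """Partial derivative of f2 with respect to player 2's action y."""
    return 2.0 * y - x

def joint_ogd(x0, y0, eta=0.1, iters=500):
    """Joint OGD: both players update simultaneously from their own
    gradient feedback, using a constant step-size eta."""
    x, y = x0, y0
    for _ in range(iters):
        gx, gy = grad1(x, y), grad2(x, y)  # simultaneous gradient feedback
        x, y = x - eta * gx, y - eta * gy  # each player descends its own cost
    return x, y  # the last iterate, not an average

x, y = joint_ogd(3.0, -2.0)
print(abs(x) < 1e-6 and abs(y) < 1e-6)  # last iterate converges to equilibrium
```

The print statement checks last-iterate (rather than time-averaged) convergence, the mode of convergence the abstract emphasizes.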