Paper Title

Tight convergence rates of the gradient method on smooth hypoconvex functions

Authors

Rotaru, Teodor, Glineur, François, Patrinos, Panagiotis

Abstract


We perform the first tight convergence analysis of the gradient method with varying step sizes when applied to smooth hypoconvex (weakly convex) functions. Hypoconvex functions are smooth nonconvex functions whose curvature is bounded and assumed to belong to the interval $[\mu, L]$, with $\mu < 0$. Our convergence rates improve and extend the existing analysis for smooth nonconvex functions with $L$-Lipschitz gradient (which corresponds to the case $\mu = -L$), and smoothly interpolate between that class and the class of smooth convex functions. We obtain our results using the performance estimation framework adapted to hypoconvex functions, for which new interpolation conditions are derived. We derive explicit upper bounds on the minimum gradient norm of the iterates for a large range of step sizes, explain why all such rates share a common structure, and prove that these rates are tight when step sizes are smaller than or equal to $1/L$. Finally, we identify the optimal constant step size that minimizes the worst-case convergence rate of the gradient method applied to hypoconvex functions.
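As a rough illustration of the setting described in the abstract, the sketch below runs the gradient method with a constant step size on a one-dimensional hypoconvex function and tracks the minimum gradient norm of the iterates, which is the quantity the paper's worst-case rates bound. The test function, its curvature bounds $\mu$ and $L$, and all numerical values are illustrative choices of mine and are not taken from the paper.

```python
import numpy as np

# Illustrative sketch (not from the paper): the gradient method with a
# constant step size on a smooth hypoconvex function.  The test function
#   f(x) = (c/2) x^2 + a cos(x)
# has second derivative f''(x) = c - a cos(x), which stays in [c - a, c + a],
# so with c = 0.375 and a = 0.625 its curvature lies in [mu, L] = [-0.25, 1].
c, a = 0.375, 0.625
mu, L = c - a, c + a          # curvature bounds, mu < 0 < L

def grad_f(x):
    return c * x - a * np.sin(x)

def gradient_method(x0, step, num_iters):
    """Run the gradient method and return the smallest gradient norm seen."""
    x = x0
    best = abs(grad_f(x))
    for _ in range(num_iters):
        x = x - step * grad_f(x)
        best = min(best, abs(grad_f(x)))
    return best

# A constant step size no larger than 1/L, the regime in which the paper
# proves its rates are tight (the specific value here is only an example).
print(gradient_method(x0=2.0, step=1.0 / L, num_iters=50))
```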
