Paper Title
Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo
Paper Authors
Paper Abstract
We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean. This removes a barrier in the prior state-of-the-art analysis for the well-studied Metropolized Hamiltonian Monte Carlo (HMC) algorithm for sampling from a strongly logconcave distribution. We correspondingly demonstrate that Metropolized HMC mixes in $\tilde{O}(\kappa d)$ iterations, improving upon the $\tilde{O}(\kappa^{1.5}\sqrt{d} + \kappa d)$ runtime of (Dwivedi et al. '18, Chen et al. '19) by a factor $(\kappa/d)^{1/2}$ when the condition number $\kappa$ is large. Our mixing time analysis introduces several techniques which, to our knowledge, have not appeared in the literature and may be of independent interest, including restrictions to a nonconvex set with good conductance behavior, and a new reduction technique for boosting a constant-accuracy total variation guarantee under weak warmness assumptions. This is the first high-accuracy mixing time result for logconcave distributions using only first-order function information which achieves linear dependence on $\kappa$; we also give evidence that this dependence is likely to be necessary for standard Metropolized first-order methods.
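To make the algorithm under discussion concrete, here is a minimal sketch of Metropolized HMC: a leapfrog integration of Hamiltonian dynamics followed by a Metropolis accept/reject step, which is what makes the chain exact and enables high-accuracy guarantees. The quadratic target, step size, and leapfrog count below are arbitrary illustrative choices, not the tuned parameterization from the paper's analysis.

```python
import numpy as np

def metropolized_hmc(f, grad_f, x0, step=0.2, n_leapfrog=10, n_iter=5000, seed=0):
    """Sample from the density proportional to exp(-f(x)).

    Illustrative sketch only: parameters are not the paper's tuned values.
    Uses only first-order information about f (values and gradients).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_iter):
        p = rng.standard_normal(x.shape)  # resample Gaussian momentum
        x_new, p_new = x.copy(), p.copy()
        # Leapfrog integration of Hamiltonian dynamics
        p_new -= 0.5 * step * grad_f(x_new)
        for _ in range(n_leapfrog - 1):
            x_new += step * p_new
            p_new -= step * grad_f(x_new)
        x_new += step * p_new
        p_new -= 0.5 * step * grad_f(x_new)
        # Metropolis filter: accept with prob min(1, exp(H_old - H_new)),
        # correcting the discretization error of the integrator
        h_old = f(x) + 0.5 * (p @ p)
        h_new = f(x_new) + 0.5 * (p_new @ p_new)
        if np.log(rng.random()) < h_old - h_new:
            x = x_new
        samples.append(x.copy())
    return np.array(samples)

# Example target: f(x) = (1/2) x^T D x, a strongly convex, smooth
# quadratic with condition number kappa = 10 (hypothetical test case).
d = 5
diag = np.linspace(1.0, 10.0, d)
f = lambda x: 0.5 * x @ (diag * x)
grad_f = lambda x: diag * x

samples = metropolized_hmc(f, grad_f, x0=np.ones(d))
```

The stationary distribution here is a Gaussian with covariance $D^{-1}$, so the empirical means should be near zero and the per-coordinate variances near $1/D_{ii}$ after burn-in.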