Paper Title
Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
Paper Authors
Paper Abstract
The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory--practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size--sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein--von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results and recommendations in realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic gradient Markov chain Monte Carlo algorithms for a wide range of models.
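The central claim above, that fixed-step SGD with iterate averaging yields an estimator whose fluctuations track those of the MLE, can be illustrated on a toy Gaussian mean-estimation problem, where the MLE is simply the sample mean. This is a minimal sketch (the step size, data, and loss are hypothetical choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: estimate the mean of a Gaussian from n observations.
# The MLE is the sample mean; SGD minimizes the average squared loss
# 0.5 * (x_i - theta)^2, whose stochastic gradient is (theta - x_i).
n = 5000
data = rng.normal(loc=2.0, scale=1.0, size=n)

h = 0.5          # large fixed step size (illustrative choice)
theta = 0.0      # initial iterate
avg = 0.0        # running Polyak-Ruppert iterate average

for t, x in enumerate(data, start=1):
    grad = theta - x          # stochastic gradient at the current iterate
    theta = theta - h * grad  # fixed-step SGD update
    avg += (theta - avg) / t  # online update of the iterate average

mle = data.mean()
print(f"averaged iterate: {avg:.4f}, MLE (sample mean): {mle:.4f}")
```

The raw iterate `theta` keeps fluctuating at a scale set by the step size, while the averaged iterate `avg` concentrates near the MLE; this is the behavior the joint step-size and sample-size scaling limit formalizes.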