从数十亿个数据点开始，以高效高斯流程回归的额定级别表示

论文标题

从数十亿个数据点开始，以高效高斯流程回归的额定级别表示

Equispaced Fourier representations for efficient Gaussian process regression from a billion data points

论文作者

Greengard, Philip, Rachh, Manas, Barnett, Alex

论文摘要

我们引入了一种基于傅立叶的快速算法，用于低维度的高斯过程回归。它近似于$ m $ m $ nodes的皇家频率网格上的复杂指数，近似于翻译不变的协方差内核。这会导致带有Toeplitz结构的重量空间$ M $ M $系统矩阵，因此可以通过快速傅立叶变换（FFT）将其应用于$ {\ Mathcal O}（M \ Mathcal O}（M \ log {M}）$操作，与数据点$ N $无关。线性系统可以在$ {\ Mathcal O}（n + m \ log {m}）$使用Unrisur FFT中设置。这可以通过迭代求解器进行有效的大规模回归，即使对于具有脂肪尾部密度的内核（大$ M $）也是如此。我们在内核近似和后平均误差上提供边界。在一个，两个和三个维度中，在一个，两个维度和三个维度中的平方指数和Matérn内核的数值实验通常以可比的精度比最先进的等级结构求解器上的1-2个数量级加速度。我们的方法允许2DMatérn-$ \ mbox {$ \ frac {3} {2} $} $从$ n = 10^9 $数据点在2分钟内在标准台式机上执行，并具有后端桌面，具有后级均值精度$ 10^{ - 3} $。这为空间统计应用程序打开了比以前可能大的100倍。

We introduce a Fourier-based fast algorithm for Gaussian process regression in low dimensions. It approximates a translationally-invariant covariance kernel by complex exponentials on an equispaced Cartesian frequency grid of $M$ nodes. This results in a weight-space $M\times M$ system matrix with Toeplitz structure, which can thus be applied to a vector in ${\mathcal O}(M \log{M})$ operations via the fast Fourier transform (FFT), independent of the number of data points $N$. The linear system can be set up in ${\mathcal O}(N + M \log{M})$ operations using nonuniform FFTs. This enables efficient massive-scale regression via an iterative solver, even for kernels with fat-tailed spectral densities (large $M$). We provide bounds on both kernel approximation and posterior mean errors. Numerical experiments for squared-exponential and Matérn kernels in one, two and three dimensions often show 1-2 orders of magnitude acceleration over state-of-the-art rank-structured solvers at comparable accuracy. Our method allows 2D Matérn-$\mbox{$\frac{3}{2}$}$ regression from $N=10^9$ data points to be performed in 2 minutes on a standard desktop, with posterior mean accuracy $10^{-3}$. This opens up spatial statistics applications 100 times larger than previously possible.

下载PDF全文

下载文献需遵守相关版权规定

论文标题