Paper Title

High-dimensional variable selection with heterogeneous signals: A precise asymptotic perspective

Authors

Saptarshi Roy, Ambuj Tewari, Ziwei Zhu

Abstract

We study the problem of exact support recovery for high-dimensional sparse linear regression under independent Gaussian design when the signals are weak, rare, and possibly heterogeneous. Under a suitable scaling of the sample size and signal sparsity, we fix the minimum signal magnitude at the information-theoretic optimal rate and investigate the asymptotic selection accuracy of best subset selection (BSS) and marginal screening (MS) procedures. We show that despite the ideal setup, somewhat surprisingly, marginal screening can fail to achieve exact recovery with probability converging to one in the presence of heterogeneous signals, whereas BSS enjoys model consistency whenever the minimum signal strength is above the information-theoretic threshold. To mitigate the computational intractability of BSS, we also propose an efficient two-stage algorithmic framework called ETS (Estimate Then Screen) comprised of an estimation step and gradient coordinate screening step, and under the same scaling assumption on sample size and sparsity, we show that ETS achieves model consistency under the same information-theoretic optimal requirement on the minimum signal strength as BSS. Finally, we present a simulation study comparing ETS with LASSO and marginal screening. The numerical results agree with our asymptotic theory even for realistic values of the sample size, dimension and sparsity.
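As a rough illustration of the marginal screening procedure mentioned in the abstract, the sketch below ranks features by the magnitude of their marginal inner product with the response and keeps the top s. It is a minimal sketch under assumed settings, not the authors' code: the function names, the simulated design, and all parameter values (n, p, s, and the weak and strong signal sizes) are hypothetical choices made for illustration only. The ETS procedure is not sketched here because the abstract does not specify its steps in enough detail.

```python
import numpy as np


def marginal_screening(X, y, s):
    """Return the indices of the s features with the largest |X_j^T y|."""
    scores = np.abs(X.T @ y)
    # argsort is ascending, so the last s entries are the top-scoring features.
    return set(np.argsort(scores)[-s:].tolist())


def simulate(n=200, p=500, s=5, weak=0.5, strong=5.0, seed=0):
    """Draw an independent Gaussian design with heterogeneous signals.

    All default values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[0] = weak        # one weak signal
    beta[1:s] = strong    # the remaining signals are much stronger
    y = X @ beta + rng.standard_normal(n)
    return X, y, set(range(s))


X, y, support = simulate()
selected = marginal_screening(X, y, s=len(support))
print("exact recovery:", selected == support)
```

Under an independent Gaussian design, the marginal statistic of every feature carries noise that grows with the total signal energy, so a few strong coefficients can make a weak one harder to distinguish from the null features. Varying `weak`, `strong`, and `n` in the sketch is one informal way to explore that effect.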
