Paper Title


Robust Learning under Strong Noise via SQs

Authors

Ioannis Anagnostides, Themis Gouleakis, Ali Marashian

Abstract


This work provides several new insights on the robustness of Kearns' statistical query framework against challenging label-noise models. First, we build on a recent result by \cite{DBLP:journals/corr/abs-2006-04787} that showed noise tolerance of distribution-independently evolvable concept classes under Massart noise. Specifically, we extend their characterization to more general noise models, including the Tsybakov model, which considerably generalizes the Massart condition by allowing the flipping probability to be arbitrarily close to $\frac{1}{2}$ for a subset of the domain. As a corollary, we employ an evolutionary algorithm by \cite{DBLP:conf/colt/KanadeVV10} to obtain the first polynomial-time algorithm with arbitrarily small excess error for learning linear threshold functions over any spherically symmetric distribution in the presence of spherically symmetric Tsybakov noise. Moreover, we posit access to a stronger oracle, in which for every labeled example we additionally obtain its flipping probability. In this model, we show that every SQ learnable class admits an efficient learning algorithm with $\mathrm{OPT} + \epsilon$ misclassification error for a broad class of noise models. This setting substantially generalizes the widely studied problem of classification under RCN with known noise rate, and corresponds to a non-convex optimization problem even when the noise function -- i.e. the flipping probabilities of all points -- is known in advance.
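To make the noise models concrete, the following is a minimal illustrative sketch, not the paper's algorithm: it samples a spherically symmetric distribution (a standard Gaussian), flips the labels of a linear threshold function with a margin-dependent probability of the Tsybakov type (the specific form of `tsybakov_eta`, with parameters `A` and `alpha`, is an assumption chosen for illustration), and demonstrates the classical correction of a correlational statistical query under RCN with known noise rate $\eta$, where the noisy expectation $\mathbb{E}[y f(x)]$ is attenuated by exactly $(1-2\eta)$.

```python
import numpy as np

rng = np.random.default_rng(0)

def tsybakov_eta(X, w, A=0.3, alpha=0.5):
    """Illustrative flipping probabilities of Tsybakov type:
    eta(x) = max(0, 1/2 - A * margin(x)^alpha), so eta(x) can be
    arbitrarily close to 1/2 near the decision boundary."""
    margin = np.abs(X @ w) / np.linalg.norm(w)
    return np.clip(0.5 - A * margin**alpha, 0.0, 0.5)

def noisy_labels(X, w, eta):
    """Labels of the LTF sign(<w, x>), each flipped independently
    with its own probability eta(x)."""
    clean = np.sign(X @ w)
    flips = rng.random(len(X)) < eta
    return np.where(flips, -clean, clean)

def corrected_correlational_sq(f_vals, y_noisy, eta_rcn):
    """Under RCN with known rate eta, E_noisy[y f(x)] = (1 - 2*eta) * E[y f(x)],
    so dividing the empirical noisy query by (1 - 2*eta) recovers
    (an estimate of) the clean query value."""
    return np.mean(y_noisy * f_vals) / (1.0 - 2.0 * eta_rcn)

d, n = 5, 200_000
X = rng.standard_normal((n, d))   # spherically symmetric distribution on R^d
w = np.ones(d)                    # target linear threshold function

# Labels under spherically symmetric Tsybakov-style noise.
eta_x = tsybakov_eta(X, w)
y_tsy = noisy_labels(X, w, eta_x)

# RCN with known rate, and a corrected correlational query E[y * sign(x_0)].
eta = 0.3
y_rcn = noisy_labels(X, w, np.full(n, eta))
est = corrected_correlational_sq(np.sign(X[:, 0]), y_rcn, eta)
true = np.mean(np.sign(X @ w) * np.sign(X[:, 0]))
```

The $(1-2\eta)$ rescaling is the standard simulation of statistical queries from RCN examples; the point of the paper's stronger oracle is that an analogous per-point correction becomes available even when the flipping probability varies over the domain.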
