可以通过sublinear样品学习布尔立方体上的单调概率分布

论文标题

可以通过sublinear样品学习布尔立方体上的单调概率分布

Monotone probability distributions over the Boolean cube can be learned with sublinear samples

论文作者

Rubinfeld, Ronitt, Vasilyan, Arsen

论文摘要

如果将坐标的值从零变为一个只能增加元素的概率，则布尔立方体上的概率分布是单调的。给定布尔立方体上单调分布的样本给定，我们（就我们所知），使用许多在域中有sublinear的样品来学习统计距离的分布的第一个算法。为此，我们开发了描述单调概率分布的结构引理。结构引理对基本测试任务的样本复杂性具有进一步的影响，用于分析布尔立方体上的单调概率分布：我们使用它来为估计单调分布到均匀的距离的任务提供非平凡的上限，并估算单调分布的支持大小。在布尔立方体上的单调概率分布的设置中，我们的算法是第一个在任意（不一定是单调）概率分布的相同测试任务的样品复杂性低于已知下限的样品复杂性。我们学习算法的另一个结果是，用于测试布尔立方体上分布是否单调的任务的改善样品复杂性。

A probability distribution over the Boolean cube is monotone if flipping the value of a coordinate from zero to one can only increase the probability of an element. Given samples of an unknown monotone distribution over the Boolean cube, we give (to our knowledge) the first algorithm that learns an approximation of the distribution in statistical distance using a number of samples that is sublinear in the domain. To do this, we develop a structural lemma describing monotone probability distributions. The structural lemma has further implications to the sample complexity of basic testing tasks for analyzing monotone probability distributions over the Boolean cube: We use it to give nontrivial upper bounds on the tasks of estimating the distance of a monotone distribution to uniform and of estimating the support size of a monotone distribution. In the setting of monotone probability distributions over the Boolean cube, our algorithms are the first to have sample complexity lower than known lower bounds for the same testing tasks on arbitrary (not necessarily monotone) probability distributions. One further consequence of our learning algorithm is an improved sample complexity for the task of testing whether a distribution on the Boolean cube is monotone.

下载PDF全文

下载文献需遵守相关版权规定

论文标题