Paper Title
Multi-label Contrastive Predictive Coding
Paper Authors
Paper Abstract
Variational mutual information (MI) estimators are widely used in unsupervised representation learning methods such as contrastive predictive coding (CPC). A lower bound on MI can be obtained from a multi-class classification problem, where a critic attempts to distinguish a positive sample drawn from the underlying joint distribution from $(m-1)$ negative samples drawn from a suitable proposal distribution. Using this approach, MI estimates are bounded above by $\log m$, and can thus severely underestimate the true MI unless $m$ is very large. To overcome this limitation, we introduce a novel estimator based on a multi-label classification problem, where the critic needs to jointly identify multiple positive samples at the same time. We show that, using the same number of negative samples, multi-label CPC is able to exceed the $\log m$ bound while still being a valid lower bound on mutual information. We demonstrate that the proposed approach leads to better mutual information estimation, yields empirical improvements in unsupervised representation learning, and beats a current state-of-the-art knowledge distillation method on 10 out of 13 tasks.
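The $\log m$ ceiling of the multi-class bound described above can be seen in a small sketch. The function and critic below are illustrative assumptions (not the paper's code): `infonce_bound` is the standard multi-class CPC (InfoNCE) estimate computed from a critic score matrix, and the bilinear critic $f(x, y) = xy$ is just one simple choice for correlated Gaussian pairs — any critic yields a valid lower bound.

```python
import numpy as np

def infonce_bound(scores):
    """Multi-class CPC (InfoNCE) MI estimate from an (m, m) critic score matrix.

    scores[i, i] is the critic value f(x_i, y_i) for the positive pair; the
    off-diagonal entries of row i act as the (m - 1) negative samples. The
    estimate is log m + mean_i [f(x_i, y_i) - logsumexp_j f(x_i, y_j)], which
    can never exceed log m, since each positive term also appears inside the
    logsumexp over its own row.
    """
    m = scores.shape[0]
    row_max = scores.max(axis=1, keepdims=True)  # subtract max for numerical stability
    lse = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    return np.log(m) + np.mean(np.diag(scores) - lse)

# Toy check on correlated Gaussian pairs with a simple bilinear critic
# f(x, y) = x * y (a hypothetical choice for this illustration).
rng = np.random.default_rng(0)
m, rho = 128, 0.9
x = rng.standard_normal(m)
y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(m)
estimate = infonce_bound(np.outer(x, y))
assert estimate <= np.log(m) + 1e-9  # the log m ceiling the abstract refers to
```

Even a near-perfect critic saturates at $\log m$: with $m = 128$ the estimate cannot exceed $\log 128 \approx 4.85$ nats, however large the true MI is. This is the limitation that the multi-label estimator proposed in the paper is designed to overcome.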