Paper Title

An Optimization and Generalization Analysis for Max-Pooling Networks

Authors

Alon Brutzkus, Amir Globerson

Abstract

Max-pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However, these architectures are not well understood from a theoretical perspective. For example, we do not understand when they can be globally optimized, or what effect over-parameterization has on generalization. Here we perform a theoretical analysis of a convolutional max-pooling architecture, proving that it can be globally optimized and can generalize well even for highly over-parameterized models. Our analysis focuses on a data-generating distribution inspired by pattern detection problems, in which a "discriminative" pattern must be detected among "spurious" patterns. We empirically validate that CNNs significantly outperform fully connected networks in our setting, as predicted by our theoretical results.
