Paper Title
Dense Hebbian neural networks: a replica symmetric picture of supervised learning
Paper Authors
Paper Abstract
We consider dense, associative neural networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of control parameters such as the quality and quantity of the training dataset, the network storage and the noise level, valid in the limit of large network size and structureless datasets: these networks may work in an ultra-storage regime (where they can handle a huge number of patterns compared with shallow neural networks) or in an ultra-detection regime (where they can perform pattern recognition at signal-to-noise ratios that are prohibitive for shallow neural networks). Guided by this random theory as a reference framework, we also numerically test the learning, storage and retrieval capabilities shown by these networks on structured datasets such as MNIST and Fashion-MNIST. As technical remarks: on the analytic side, we implement large-deviations and stability analyses within Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials; on the computational side, we insert the Plefka approximation into the Monte Carlo scheme to speed up the evaluation of the synaptic tensors. Overall, we obtain a novel and broad approach to investigate supervised learning in neural networks beyond the shallow limit.
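To make the setting concrete, here is a minimal illustrative sketch, not the paper's actual implementation, of the kind of model the abstract describes: a dense (4-body) Hebbian network whose couplings are built from noisy teacher-provided examples, with retrieval run via Glauber (Monte Carlo) dynamics. All sizes (N, K, M), the quality parameter r, and the example generator are hypothetical assumptions, and the Plefka acceleration mentioned above is not reproduced here.

```python
# Minimal sketch (assumed setup, not the authors' code): supervised Hebbian
# storage in a dense p = 4 network and Monte Carlo (Glauber) retrieval.
import numpy as np

rng = np.random.default_rng(0)
N, K, M, r = 30, 3, 40, 0.8  # neurons, patterns, examples per pattern, example quality

# Teacher's archetypes and noisy training examples: each example bit agrees
# with its archetype with probability (1 + r) / 2 (r = 1 means noiseless).
xi = rng.choice([-1, 1], size=(K, N))
flips = rng.choice([-1, 1], p=[(1 - r) / 2, (1 + r) / 2], size=(K, M, N))
examples = xi[:, None, :] * flips

# Supervised Hebbian coupling tensor for a dense (4-body) network: the
# empirical average over examples plays the role of the archetype.
xi_hat = examples.mean(axis=1)                          # (K, N)
J = np.einsum('ki,kj,kl,km->ijlm', xi_hat, xi_hat, xi_hat, xi_hat)

def local_field(s, i):
    """Post-synaptic potential on neuron i from the 4-body couplings
    (self-interaction terms are kept for simplicity in this sketch)."""
    return np.einsum('jlm,j,l,m->', J[i], s, s, s)

def glauber_sweep(s, beta):
    """One Monte Carlo sweep at inverse temperature beta."""
    for i in rng.permutation(len(s)):
        h = local_field(s, i)
        s[i] = 1 if rng.random() < 1 / (1 + np.exp(-2 * beta * h)) else -1
    return s

# Retrieval test: start from a corrupted archetype (~20% flipped) and relax.
s = xi[0] * rng.choice([-1, 1], p=[0.2, 0.8], size=N)
for _ in range(20):
    s = glauber_sweep(s, beta=2.0)
print("overlap with archetype:", s @ xi[0] / N)
```

The 4-body couplings are what makes the network "dense": the local field is cubic in the other spins rather than linear, which is the source of the enhanced storage and detection regimes the abstract summarizes. Evaluating the full tensor J is the expensive step that, per the abstract, the authors speed up with the Plefka approximation.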