Paper Title

Depth Enables Long-Term Memory for Recurrent Neural Networks

Authors

Ziv, Alon

Abstract

A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks involving sequential data is their ability to model intricate long-term temporal dependencies. However, a well-established measure of RNNs' long-term memory capacity is lacking, and thus formal understanding of the effect of depth on their ability to correlate data throughout time is limited. Specifically, existing depth-efficiency results on convolutional networks do not suffice to account for the success of deep RNNs on data of varying lengths. To address this, we introduce a measure of the network's ability to support information flow across time, referred to as the Start-End separation rank, which reflects the distance of the function realized by the recurrent network from modeling no dependency between the beginning and end of the input sequence. We prove that deep recurrent networks support Start-End separation ranks which are combinatorially higher than those supported by their shallow counterparts. Thus, we establish that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies, and provide an exemplar of quantifying this key attribute. We empirically demonstrate the discussed phenomena on common RNNs through extensive experimental evaluation using the optimization technique of restricting the hidden-to-hidden matrix to be orthogonal. Finally, we employ the tool of quantum Tensor Networks to gain additional graphical insights regarding the complexity brought forth by depth in recurrent networks.
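For readers unfamiliar with the measure, the following is a minimal sketch of the separation-rank notion the abstract builds on; the notation (S and E for the "Start" and "End" halves of the input sequence) is assumed here for illustration rather than quoted from the paper:

```latex
% Separation rank of f w.r.t. the (Start, End) partition of its inputs:
% the smallest number R of summands needed to write f as a sum of
% products of a function of the start half and a function of the end half.
\[
  \mathrm{sep}_{(S,E)}(f) \;=\; \min\Big\{ R \in \mathbb{N} \;:\;
    f(\mathbf{x}^{S}, \mathbf{x}^{E})
      = \sum_{r=1}^{R} g_r(\mathbf{x}^{S})\, h_r(\mathbf{x}^{E}) \Big\}.
\]
```

Under this definition, a separation rank of 1 corresponds exactly to a separable function, i.e., one that models no dependency between the beginning and end of the sequence; this is the baseline that the abstract's "distance from modeling no dependency" refers to.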
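The experiments mentioned in the abstract constrain the hidden-to-hidden matrix to be orthogonal. Below is a minimal PyTorch sketch of that style of constraint using the library's built-in orthogonal parametrization; class and variable names are illustrative and are not taken from the paper's code:

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal


class OrthogonalRNNCell(nn.Module):
    """Vanilla RNN cell whose hidden-to-hidden weight is kept orthogonal.

    The parametrization keeps the recurrent matrix on the orthogonal
    manifold throughout training, which preserves gradient norms across
    time steps (the usual motivation for this optimization technique).
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.w_in = nn.Linear(input_size, hidden_size)
        # orthogonal() wraps the layer so its `weight` is always orthogonal.
        self.w_hh = orthogonal(nn.Linear(hidden_size, hidden_size, bias=False))

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.w_in(x) + self.w_hh(h))


if __name__ == "__main__":
    # Depth-2 stack over a toy sequence: each layer's hidden state feeds
    # the layer above, mirroring the deep recurrent networks in the abstract.
    torch.manual_seed(0)
    batch, seq_len, dim = 4, 10, 16
    layers = nn.ModuleList([OrthogonalRNNCell(dim, dim) for _ in range(2)])
    hidden = [torch.zeros(batch, dim) for _ in layers]
    for x_t in torch.randn(seq_len, batch, dim):
        inp = x_t
        for l, cell in enumerate(layers):
            hidden[l] = cell(inp, hidden[l])
            inp = hidden[l]
    # The recurrent weight stays orthogonal: W @ W.T ~ I.
    w = layers[0].w_hh.weight
    print(torch.allclose(w @ w.T, torch.eye(dim), atol=1e-5))
```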
