Event-based Asynchronous Sparse Convolutional Networks

Paper title

Event-based Asynchronous Sparse Convolutional Networks

Authors

Nico Messikommer, Daniel Gehrig, Antonio Loquercio, Davide Scaramuzza

Abstract

Event cameras are bio-inspired sensors that respond to per-pixel brightness changes in the form of asynchronous and sparse "events". Recently, pattern recognition algorithms, such as learning-based methods, have made significant progress with event cameras by converting events into synchronous, dense, image-like representations and applying traditional machine learning methods developed for standard cameras. However, these approaches discard the spatial and temporal sparsity inherent in event data at the cost of higher computational complexity and latency. In this work, we present a general framework for converting models trained on synchronous image-like event representations into asynchronous models with identical output, thus directly leveraging the intrinsic asynchronous and sparse nature of the event data. We show both theoretically and experimentally that this drastically reduces the computational complexity and latency of high-capacity, synchronous neural networks without sacrificing accuracy. In addition, our framework has several desirable characteristics: (i) it exploits spatio-temporal sparsity of events explicitly, (ii) it is agnostic to the event representation, network architecture, and task, and (iii) it does not require any train-time change, since it is compatible with the standard neural network training process. We thoroughly validate the proposed framework on two computer vision tasks: object detection and object recognition. In these tasks, we reduce the computational complexity by up to 20 times with respect to high-latency neural networks. At the same time, we outperform state-of-the-art asynchronous approaches by up to 24% in prediction accuracy.
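To make the idea of an asynchronous model "with identical output" more concrete, the following is a minimal sketch for a single linear convolution layer only. It is not the authors' implementation, and it does not cover nonlinearities, multiple layers, or sparse convolution variants. The function names `dense_conv2d` and `async_update`, the single-channel input, and the zero padding are all assumptions made for illustration. When one pixel of the input representation changes (one event), only the outputs inside that pixel's kernel neighborhood are touched, yet the result matches a full dense recomputation up to floating-point rounding.

```python
import numpy as np

def dense_conv2d(image, kernel):
    """Zero-padded 'same' cross-correlation with an odd-sized square kernel (synchronous path)."""
    k = kernel.shape[0]
    r = k // 2
    h, w = image.shape
    padded = np.pad(image, r)
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def async_update(image, output, kernel, py, px, new_value):
    """Apply one pixel change in place and patch only the affected outputs.

    Returns the number of output sites that were updated (at most k * k).
    """
    r = kernel.shape[0] // 2
    h, w = image.shape
    delta = new_value - image[py, px]
    image[py, px] = new_value
    touched = 0
    for y in range(max(0, py - r), min(h, py + r + 1)):
        for x in range(max(0, px - r), min(w, px + r + 1)):
            # Output (y, x) reads input (py, px) through kernel tap (py - y + r, px - x + r).
            output[y, x] += kernel[py - y + r, px - x + r] * delta
            touched += 1
    return touched

# Hypothetical usage: a 32x32 representation, a 3x3 kernel, and one event at (5, 7).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
kernel = rng.random((3, 3))
output = dense_conv2d(image, kernel)
sites = async_update(image, output, kernel, 5, 7, 1.0)
print(sites, "of", output.size, "output sites updated")
```

In this toy setting the per-event cost scales with the kernel size rather than with the image size, which is the kind of saving the abstract refers to; the actual complexity figures reported in the abstract come from the authors' full framework, not from this sketch.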
