Paper Title
Deep Grouping Model for Unified Perceptual Parsing
Paper Authors
Paper Abstract
The perceptual-based grouping process produces a hierarchical and compositional image representation that helps both human and machine vision systems recognize heterogeneous visual concepts. Examples can be found in the classical hierarchical superpixel segmentation and image parsing works. However, the grouping process is largely overlooked in modern CNN-based image segmentation networks due to many challenges, including the inherent incompatibility between the grid-shaped CNN feature map and the irregular-shaped perceptual grouping hierarchy. To overcome these challenges, we propose a deep grouping model (DGM) that tightly marries the two types of representations and defines bottom-up and top-down processes for feature exchange. When evaluated on the recent Broden+ dataset for the unified perceptual parsing task, the model achieves state-of-the-art results while incurring a small computational overhead compared to other context-based segmentation models. Furthermore, the DGM has better interpretability than modern CNN methods.
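To make the bottom-up/top-down feature exchange idea concrete, below is a minimal PyTorch sketch of pooling a grid-shaped feature map into an irregular grouping (one level of a grouping hierarchy, e.g. superpixels) and broadcasting group features back onto the grid. This is an illustrative assumption, not the paper's actual DGM operators: the function names `bottom_up_pool` / `top_down_unpool` and the choice of mean pooling are ours.

```python
import torch


def bottom_up_pool(feat, assign):
    """Average grid features into per-group features (e.g. superpixels).

    feat:   (C, H, W) CNN feature map
    assign: (H, W) long tensor mapping each pixel to a group id in [0, G)
    returns (G, C) group features
    """
    C, H, W = feat.shape
    ids = assign.reshape(-1)                           # (H*W,)
    flat = feat.reshape(C, -1).t()                     # (H*W, C)
    G = int(ids.max()) + 1
    sums = torch.zeros(G, C).index_add_(0, ids, flat)  # sum features per group
    counts = torch.zeros(G).index_add_(0, ids, torch.ones(ids.numel()))
    return sums / counts.clamp(min=1).unsqueeze(1)     # mean per group


def top_down_unpool(group_feat, assign):
    """Broadcast group features back onto the pixel grid: (G, C) -> (C, H, W)."""
    H, W = assign.shape
    return group_feat[assign.reshape(-1)].t().reshape(-1, H, W)


# Toy usage: a 64-channel feature map and a two-group assignment map.
feat = torch.randn(64, 8, 8)
assign = torch.arange(8 * 8).reshape(8, 8) % 2   # alternating group ids 0/1
groups = bottom_up_pool(feat, assign)            # (2, 64)
back = top_down_unpool(groups, assign)           # (64, 8, 8)
```

In this sketch the bottom-up step aggregates pixel features into region features and the top-down step scatters them back, so grid-shaped and region-shaped representations can exchange information; the actual model composes such exchanges over a multi-level grouping hierarchy.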