Paper Title

Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction

Authors

Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu

Abstract

Benefiting from the color independence, illumination invariance and location discrimination of the depth map, it can provide important supplementary information for extracting salient objects in complex environments. However, high-quality depth sensors are expensive and cannot be widely deployed, while general-purpose depth sensors produce noisy and sparse depth information, which introduces irreversible interference into depth-based networks. In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD). Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks; in this way, the depth information can be completed and purified. Moreover, we introduce a multi-modal filtered transformer (MFT) module, which is equipped with three modality-specific filters to generate a transformer-enhanced feature for each modality. The proposed model works in a depth-free style during the testing phase. Experiments show that it not only significantly surpasses depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time. Furthermore, the resulting depth map can help existing RGB-D SOD methods obtain significant performance gains. The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/MMFT.
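The core idea of the MFT module — letting tokens from the three task branches attend to each other jointly, then filtering the shared attention output with a modality-specific gate — can be sketched as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names, the elementwise gating form, and the token shapes are all assumptions made for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def multimodal_filtered_transformer(feats, filters):
    """feats: dict modality -> (tokens, dim) feature matrix.
    filters: dict modality -> (dim,) modality-specific gate (assumed form).
    Tokens from all modalities attend jointly; each modality's slice of the
    attended output is gated by its own filter and added back residually."""
    names = list(feats)
    joint = np.concatenate([feats[n] for n in names], axis=0)  # fuse all tokens
    attended = attention(joint, joint, joint)                  # cross-modal attention
    out, start = {}, 0
    for n in names:
        t = feats[n].shape[0]
        # Modality-specific filtering of the shared attention output.
        out[n] = feats[n] + filters[n] * attended[start:start + t]
        start += t
    return out

rng = np.random.default_rng(0)
dim = 8
modalities = ("saliency", "depth", "contour")
feats = {m: rng.standard_normal((4, dim)) for m in modalities}
filters = {m: rng.uniform(size=dim) for m in modalities}
enhanced = multimodal_filtered_transformer(feats, filters)
print(enhanced["depth"].shape)  # (4, 8): same shape as the input feature
```

In a trained network the filters would be learned parameters (and the attention would use learned projections); the sketch only shows how one shared attention pass can still yield a distinct, task-aware enhanced feature per modality.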
