Paper Title

Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction

Authors

Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu

Abstract

Benefiting from the color independence, illumination invariance and location discrimination of the depth map, it can provide important supplementary information for extracting salient objects in complex environments. However, high-quality depth sensors are expensive and cannot be widely deployed, while general-purpose depth sensors produce noisy and sparse depth information, which introduces irreversible interference into depth-based networks. In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD). Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks; in this way, the depth information can be completed and purified. Moreover, we introduce a multi-modal filtered transformer (MFT) module, which is equipped with three modality-specific filters to generate a transformer-enhanced feature for each modality. The proposed model works in a depth-free style during the testing phase. Experiments show that it not only significantly surpasses depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time. Furthermore, the resulting depth map can help existing RGB-D SOD methods obtain significant performance gains. The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/MMFT.
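The core idea of the MFT module — letting tokens from the three task branches attend to each other jointly, then filtering the shared attention output with a modality-specific gate — can be sketched as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names, the elementwise gating form, and the token shapes are all assumptions made for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def multimodal_filtered_transformer(feats, filters):
    """feats: dict modality -> (tokens, dim) feature matrix.
    filters: dict modality -> (dim,) modality-specific gate (assumed form).
    Tokens from all modalities attend jointly; each modality's slice of the
    attended output is gated by its own filter and added back residually."""
    names = list(feats)
    joint = np.concatenate([feats[n] for n in names], axis=0)  # fuse all tokens
    attended = attention(joint, joint, joint)                  # cross-modal attention
    out, start = {}, 0
    for n in names:
        t = feats[n].shape[0]
        # Modality-specific filtering of the shared attention output.
        out[n] = feats[n] + filters[n] * attended[start:start + t]
        start += t
    return out

rng = np.random.default_rng(0)
dim = 8
modalities = ("saliency", "depth", "contour")
feats = {m: rng.standard_normal((4, dim)) for m in modalities}
filters = {m: rng.uniform(size=dim) for m in modalities}
enhanced = multimodal_filtered_transformer(feats, filters)
print(enhanced["depth"].shape)  # (4, 8): same shape as the input feature
```

In a trained network the filters would be learned parameters (and the attention would use learned projections); the sketch only shows how one shared attention pass can still yield a distinct, task-aware enhanced feature per modality.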
