Paper Title

FUTR3D: A Unified Sensor Fusion Framework for 3D Detection

Authors

Xuanyao Chen, Tianyuan Zhang, Yue Wang, Yilun Wang, Hang Zhao

Abstract

Sensor fusion is an essential topic in many perception systems, such as autonomous driving and robotics. Existing multi-modal 3D detection models usually involve customized designs depending on the sensor combinations or setups. In this work, we propose the first unified end-to-end sensor fusion framework for 3D detection, named FUTR3D, which can be used in (almost) any sensor configuration. FUTR3D employs a query-based Modality-Agnostic Feature Sampler (MAFS), together with a transformer decoder with a set-to-set loss for 3D detection, thus avoiding late-fusion heuristics and post-processing tricks. We validate the effectiveness of our framework on various combinations of cameras, low-resolution LiDARs, high-resolution LiDARs, and Radars. On the nuScenes dataset, FUTR3D achieves better performance than specifically designed methods across different sensor combinations. Moreover, FUTR3D achieves great flexibility with different sensor configurations and enables low-cost autonomous driving. For example, using only a 4-beam LiDAR with cameras, FUTR3D (58.0 mAP) performs on par with the state-of-the-art 3D detection model CenterPoint (56.6 mAP), which uses a 32-beam LiDAR.
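To make the core idea of a query-based, modality-agnostic sampler concrete, the sketch below shows one way such a module could look: each object query carries a 3D reference point, features are bilinearly sampled from each available modality's feature map at that point, and the samples are fused into a single per-query feature. This is a minimal illustrative sketch assuming BEV-style feature maps for all modalities; the class name, tensor shapes, and fusion MLP are our assumptions, not the paper's actual implementation (FUTR3D additionally projects reference points into camera image planes, which is omitted here).

```python
# Hypothetical sketch of query-based modality-agnostic feature sampling.
# Not the paper's code: shapes, names, and the fusion MLP are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityAgnosticFeatureSampler(nn.Module):
    """Samples per-query features from BEV-style feature maps of any
    subset of modalities (e.g., LiDAR BEV, radar BEV) and fuses them."""

    def __init__(self, embed_dim=256, num_modalities=2):
        super().__init__()
        # Fusion MLP maps concatenated per-modality samples back to embed_dim.
        self.fuse = nn.Sequential(
            nn.Linear(embed_dim * num_modalities, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, ref_points, feature_maps):
        """
        ref_points:   (B, Q, 2) query reference points in [-1, 1] BEV coords.
        feature_maps: list of (B, C, H, W) maps, one per available modality.
        returns:      (B, Q, C) fused per-query features.
        """
        grid = ref_points.unsqueeze(2)  # (B, Q, 1, 2), as grid_sample expects
        sampled = []
        for fmap in feature_maps:
            # Bilinear sampling at each query's reference point.
            feat = F.grid_sample(fmap, grid, align_corners=False)  # (B, C, Q, 1)
            sampled.append(feat.squeeze(-1).transpose(1, 2))       # (B, Q, C)
        # Concatenate across modalities, then fuse into one feature per query.
        return self.fuse(torch.cat(sampled, dim=-1))


# Usage: 900 queries sampling from two modalities' BEV feature maps.
B, Q, C = 2, 900, 256
sampler = ModalityAgnosticFeatureSampler(embed_dim=C, num_modalities=2)
ref = torch.rand(B, Q, 2) * 2 - 1                   # random reference points
maps = [torch.randn(B, C, 128, 128) for _ in range(2)]
out = sampler(ref, maps)                            # (2, 900, 256)
print(out.shape)
```

Because sampling is driven by the queries rather than by any one sensor, dropping or adding a modality in this sketch only changes the list of feature maps passed in, which reflects why a design like this can cover (almost) any sensor configuration.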
