Paper Title
Multi-Modal Domain Fusion for Multi-modal Aerial View Object Classification
Paper Authors
Paper Abstract
Object detection and classification using aerial images is a challenging task, as information regarding the targets is not abundant. Synthetic Aperture Radar (SAR) images can be used in Automatic Target Recognition (ATR) systems because SAR can operate in all-weather conditions and in low-light settings. However, SAR images contain salt-and-pepper noise (speckle noise), which hinders deep learning models from extracting meaningful features. Using only aerial-view Electro-Optical (EO) images for ATR systems may also not result in high accuracy, as these images are of low resolution and do not provide ample information in extreme weather conditions. Therefore, information from multiple sensors can be used to enhance the performance of ATR systems. In this paper, we explore a methodology that uses both EO and SAR sensor information to effectively improve the performance of ATR systems by handling the shortcomings of each sensor. A novel Multi-Modal Domain Fusion (MDF) network is proposed to learn domain-invariant features from multi-modal data and use them to accurately classify aerial-view objects. The proposed MDF network achieves top-10 performance in Track-1 with an accuracy of 25.3% and top-5 performance in Track-2 with an accuracy of 34.26% in the test phase of the PBVS MAVOC Challenge dataset [18].
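The following is a minimal sketch of the general two-branch EO/SAR fusion idea summarized in the abstract: one backbone per modality, with the two embeddings fused before a shared classification head. It assumes a PyTorch/torchvision environment; the class name FusionClassifier, the ResNet-18 backbones, the concatenation-based fusion, the layer sizes, and the class count are illustrative assumptions and do not reproduce the exact MDF architecture proposed in the paper.

import torch
import torch.nn as nn
import torchvision.models as models


class FusionClassifier(nn.Module):
    """Illustrative two-branch EO/SAR fusion classifier (not the paper's exact MDF network)."""

    def __init__(self, num_classes: int = 10):  # 10 classes chosen for illustration
        super().__init__()
        # Separate backbones so each modality learns its own low-level features.
        self.eo_backbone = models.resnet18(weights=None)
        self.sar_backbone = models.resnet18(weights=None)
        feat_dim = self.eo_backbone.fc.in_features  # 512 for ResNet-18
        self.eo_backbone.fc = nn.Identity()
        self.sar_backbone.fc = nn.Identity()
        # Fuse the two embeddings by concatenation and classify the aerial-view object.
        self.classifier = nn.Sequential(
            nn.Linear(2 * feat_dim, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, eo: torch.Tensor, sar: torch.Tensor) -> torch.Tensor:
        eo_feat = self.eo_backbone(eo)     # (B, 512)
        sar_feat = self.sar_backbone(sar)  # (B, 512)
        fused = torch.cat([eo_feat, sar_feat], dim=1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = FusionClassifier(num_classes=10)
    eo = torch.randn(2, 3, 224, 224)   # dummy EO chips
    sar = torch.randn(2, 3, 224, 224)  # dummy SAR chips (single channel replicated to 3)
    logits = model(eo, sar)
    print(logits.shape)  # torch.Size([2, 10])

Late fusion by concatenation is only one possible design choice; the paper's MDF network additionally targets domain-invariant feature learning, which this sketch does not attempt to model.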