Paper Title
Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation
Paper Authors
Paper Abstract
This work considers supervised contrastive learning for semantic segmentation. We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks. Our key methodological insight is to leverage samples from the feature spaces emanating from multiple stages of a model's encoder itself, requiring neither data augmentation nor online memory banks to obtain a diverse set of samples. To allow for such an extension, we introduce an efficient and effective sampling process that enables applying contrastive losses over the encoder's features at multiple scales. Furthermore, by first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint by introducing cross-scale contrastive learning, linking high-resolution local features to low-resolution global features. Combined, our multi-scale and cross-scale contrastive losses boost the performance of various models (DeepLabV3, HRNet, OCRNet, UPerNet) with both CNN and Transformer backbones, when evaluated on 4 diverse datasets from natural (Cityscapes, PascalContext, ADE20K) but also surgical (CaDIS) domains. Our code is available at https://github.com/RViMLab/MS_CS_ContrSeg.
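To make the idea concrete, the following minimal PyTorch sketch shows a supervised contrastive loss applied to labelled feature vectors sampled from encoder stages after projection to a common embedding space. The sampling strategy, projection to 128 dimensions, class count, temperature, and tensor shapes are assumptions made for illustration only, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(feats, labels, temperature=0.1):
    """Supervised contrastive loss over sampled, projected feature vectors.

    feats:  (N, D) feature vectors, one per sampled pixel location.
    labels: (N,)   ground-truth class of each sampled pixel.
    Vectors sharing a class act as positives for each other; all remaining
    vectors in the batch act as negatives.
    """
    feats = F.normalize(feats, dim=1)
    logits = feats @ feats.t() / temperature                     # (N, N) pairwise similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=feats.device)
    logits = logits.masked_fill(self_mask, -1e9)                 # exclude self-pairs
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    per_anchor = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return per_anchor[pos_mask.any(1)].mean()                    # ignore anchors with no positive


# Toy usage with random samples from two scales, assumed already projected to a
# common 128-d space: a "multi-scale" term applies the loss within each scale,
# while a "cross-scale" term pools samples across scales so that positives can
# link high-resolution local features to low-resolution global features.
z_hi, y_hi = torch.randn(64, 128), torch.randint(0, 19, (64,))  # high-resolution samples
z_lo, y_lo = torch.randn(32, 128), torch.randint(0, 19, (32,))  # low-resolution samples
loss_ms = supervised_contrastive_loss(z_hi, y_hi) + supervised_contrastive_loss(z_lo, y_lo)
loss_cs = supervised_contrastive_loss(torch.cat([z_hi, z_lo]), torch.cat([y_hi, y_lo]))
```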