论文标题
班级意识的对比度半监督学习
Class-Aware Contrastive Semi-Supervised Learning
论文作者
论文摘要
基于伪标签的半监督学习(SSL)在原始数据利用率上取得了巨大的成功。但是,由于自我生成的人工标记中包含的噪声,其训练程序受到确认偏差的影响。此外,该模型的判断在具有广泛分布数据的真实应用程序中变得更加嘈杂。为了解决这个问题,我们提出了一种名为“ class-Aware对比度半监督学习”(CCSSL)的通用方法,该方法是提高伪标签质量并增强现实世界中模型的鲁棒性的插手。我们的方法不是将现实世界数据视为一个联合集合,而是分别处理可靠的分布数据,并将其融合到下游任务中,并将其与图像对比度融合到下游任务中,以更好地泛化。此外,通过应用目标重新加权,我们成功地强调了干净的标签学习,并同时减少了嘈杂的标签学习。尽管它很简单,但我们提出的CCSSL比标准数据集CIFAR100和STL10上最新的SSL方法具有显着的性能改进。在现实世界数据集Semi-Inat 2021上,我们将FixMatch提高了9.80%,COMATCH提高了3.18%。代码可用https://github.com/tencentyouturesearch/classification-spoomls。
Pseudo-label-based semi-supervised learning (SSL) has achieved great success on raw data utilization. However, its training procedure suffers from confirmation bias due to the noise contained in self-generated artificial labels. Moreover, the model's judgment becomes noisier in real-world applications with extensive out-of-distribution data. To address this issue, we propose a general method named Class-aware Contrastive Semi-Supervised Learning (CCSSL), which is a drop-in helper to improve the pseudo-label quality and enhance the model's robustness in the real-world setting. Rather than treating real-world data as a union set, our method separately handles reliable in-distribution data with class-wise clustering for blending into downstream tasks and noisy out-of-distribution data with image-wise contrastive for better generalization. Furthermore, by applying target re-weighting, we successfully emphasize clean label learning and simultaneously reduce noisy label learning. Despite its simplicity, our proposed CCSSL has significant performance improvements over the state-of-the-art SSL methods on the standard datasets CIFAR100 and STL10. On the real-world dataset Semi-iNat 2021, we improve FixMatch by 9.80% and CoMatch by 3.18%. Code is available https://github.com/TencentYoutuResearch/Classification-SemiCLS.