Paper Title

Learning from Data with Noisy Labels Using Temporal Self-Ensemble

Paper Authors

Jun Ho Lee, Jae Soon Baik, Tae Hwan Hwang, Jun Won Choi

Paper Abstract

Real-world datasets inevitably contain many mislabeled samples. Because deep neural networks (DNNs) have an enormous capacity to memorize noisy labels, a robust training scheme is required to prevent labeling errors from degrading the generalization performance of DNNs. Current state-of-the-art methods employ a co-training scheme that trains dual networks using samples associated with small losses. In practice, however, training two networks simultaneously burdens computing resources. In this study, we propose a simple yet effective robust training scheme that operates by training only a single network. During training, the proposed method generates a temporal self-ensemble by sampling intermediate network parameters from the weight trajectory formed by stochastic gradient descent optimization. The loss sum evaluated with these self-ensembles is used to identify incorrectly labeled samples. In parallel, our method generates multi-view predictions by transforming the input data into various forms and considers their agreement to identify incorrectly labeled samples. By combining the aforementioned metrics, we present the {\it self-ensemble-based robust training} (SRT) method, which can filter out samples with noisy labels to reduce their influence on training. Experiments on widely used public datasets demonstrate that the proposed method achieves state-of-the-art performance in some categories without training dual networks.
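
As a rough illustration only (not the authors' released code), the sketch below shows how the two signals described in the abstract could be computed in PyTorch: weight snapshots taken along the SGD trajectory form the temporal self-ensemble, the per-sample loss summed over those snapshots flags likely-mislabeled data, and agreement across augmented views of the same inputs provides the second metric. Names such as train_with_snapshots and snapshot_every, and the assumption that the data loader also yields dataset indices, are hypothetical.

# Minimal sketch of the two filtering signals, assuming a PyTorch classifier
# and a loader that yields (inputs, labels, dataset_indices). Hypothetical
# helper names; not the paper's actual implementation.
import copy
import torch
import torch.nn.functional as F

def train_with_snapshots(model, optimizer, loader, epochs, snapshot_every=1):
    """Train a single network with SGD, saving weight snapshots along the
    optimization trajectory; these snapshots form the temporal self-ensemble."""
    snapshots = []
    for epoch in range(epochs):
        model.train()
        for x, y, _ in loader:
            optimizer.zero_grad()
            F.cross_entropy(model(x), y).backward()
            optimizer.step()
        if (epoch + 1) % snapshot_every == 0:
            snapshots.append(copy.deepcopy(model.state_dict()))
    return snapshots

@torch.no_grad()
def ensemble_loss_sum(model, snapshots, loader, num_samples):
    """Per-sample loss summed over all self-ensemble members; a large sum
    marks a sample as likely mislabeled (small-loss criterion)."""
    loss_sum = torch.zeros(num_samples)
    for state in snapshots:
        model.load_state_dict(state)
        model.eval()
        for x, y, idx in loader:
            loss_sum[idx] += F.cross_entropy(model(x), y, reduction="none").cpu()
    return loss_sum

@torch.no_grad()
def view_agreement(model, views):
    """Agreement of predictions across augmented views of the same batch:
    the fraction of views voting for each sample's majority class."""
    model.eval()
    preds = torch.stack([model(v).argmax(dim=1) for v in views])  # (V, B)
    majority = preds.mode(dim=0).values
    return (preds == majority).float().mean(dim=0)  # low value -> suspicious

A practical variant would rank samples by the loss sum, keep the small-loss fraction as in the co-training literature, and use the agreement score as a second filter; the concrete cut-off policy for combining the two metrics is a design choice the abstract does not specify.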
