Paper title
Continual Learning with Evolving Class Ontologies
Paper authors
Paper abstract
Lifelong learners must recognize concept vocabularies that evolve over time. A common yet underexplored scenario is learning with class labels that continually refine/expand old classes. For example, humans learn to recognize ${\tt dog}$ before dog breeds. In practical settings, dataset $\textit{versioning}$ often introduces refinement to ontologies, such as autonomous vehicle benchmarks that refine a previous ${\tt vehicle}$ class into ${\tt school-bus}$ as autonomous operations expand to new cities. This paper formalizes a protocol for studying the problem of $\textit{Learning with Evolving Class Ontology}$ (LECO). LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous ${\tt dog}$). LECO explores such questions as whether to annotate new data or relabel the old, how to leverage coarse labels, and whether to finetune the previous TP's model or train from scratch. To answer these questions, we leverage insights from related problems such as class-incremental learning. We validate them under the LECO protocol through the lens of image classification (CIFAR and iNaturalist) and semantic segmentation (Mapillary). Our experiments lead to surprising conclusions: while the status quo is to relabel existing datasets with new ontologies (such as COCO-to-LVIS or Mapillary1.2-to-2.0), LECO demonstrates that a far better strategy is to annotate $\textit{new}$ data with the new ontology. However, this produces an aggregate dataset with inconsistent old-vs-new labels, complicating learning. To address this challenge, we adopt methods from semi-supervised and partial-label learning. Surprisingly, such strategies can be made near-optimal, approaching an "oracle" that learns on the aggregate dataset exhaustively labeled with the newest ontology.
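
The abstract mentions leveraging coarse labels through partial-label learning but does not spell out a loss. As a rough illustration only, the sketch below shows one common way to do this: a sample that carries only a coarse label is scored by summing the predicted probabilities of all fine classes that refine it. The class names, the `COARSE_TO_FINE` mapping, and the function names are hypothetical, and this is not claimed to be the exact formulation used in the paper.

```python
import math

# Hypothetical two-level ontology (an assumption for illustration):
# each coarse label maps to the indices of the fine labels that refine it.
FINE_CLASSES = ["husky", "corgi", "school-bus", "sedan"]
COARSE_TO_FINE = {"dog": [0, 1], "vehicle": [2, 3]}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fine_label_loss(logits, fine_index):
    """Standard cross-entropy for a sample that carries a fine label."""
    return -math.log(softmax(logits)[fine_index])


def coarse_label_loss(logits, coarse_label):
    """Partial-label loss for a sample that carries only a coarse label.

    The coarse-class probability is the sum of the probabilities of its
    fine children, so any child counts as an acceptable prediction.
    """
    probs = softmax(logits)
    coarse_prob = sum(probs[i] for i in COARSE_TO_FINE[coarse_label])
    return -math.log(coarse_prob)


if __name__ == "__main__":
    logits = [2.0, 1.0, -1.0, -2.0]  # fine-class scores for one image
    print("fine loss (husky):", fine_label_loss(logits, 0))
    print("coarse loss (dog):", coarse_label_loss(logits, "dog"))
```

Under this sketch, old data labeled only as ${\tt dog}$ and new data labeled with a specific breed can be mixed in one training batch: each sample simply uses whichever loss matches the granularity of its label.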