Deep Image Clustering with Contrastive Learning and Multi-scale Graph Convolutional Networks

Paper Title

Deep Image Clustering with Contrastive Learning and Multi-scale Graph Convolutional Networks

Authors

Yuankun Xu, Dong Huang, Chang-Dong Wang, Jian-Huang Lai

Abstract

Deep clustering has shown its promising capability in joint representation learning and clustering via deep neural networks. Despite the significant progress, the existing deep clustering works mostly utilize some distribution-based clustering loss, lacking the ability to unify representation learning and multi-scale structure learning. To address this, this paper presents a new deep clustering approach termed image clustering with contrastive learning and multi-scale graph convolutional networks (IcicleGCN), which bridges the gap between convolutional neural network (CNN) and graph convolutional network (GCN) as well as the gap between contrastive learning and multi-scale structure learning for the deep clustering task. Our framework consists of four main modules, namely, the CNN-based backbone, the Instance Similarity Module (ISM), the Joint Cluster Structure Learning and Instance reconstruction Module (JC-SLIM), and the Multi-scale GCN module (M-GCN). Specifically, the backbone network with two weight-sharing views is utilized to learn the representations for the two augmented samples (from each image). The learned representations are then fed to ISM and JC-SLIM for joint instance-level and cluster-level contrastive learning, respectively, during which an auto-encoder in JC-SLIM is also pretrained to serve as a bridge to the M-GCN module. Further, to enforce multi-scale neighborhood structure learning, two streams of GCNs and the auto-encoder are simultaneously trained via (i) the layer-wise interaction with representation fusion and (ii) the joint self-adaptive learning. Experiments on multiple image datasets demonstrate the superior clustering performance of IcicleGCN over the state-of-the-art. The code is available at https://github.com/xuyuankun631/IcicleGCN.
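
The abstract describes instance-level contrastive learning over two augmented views of each image but does not spell out the loss. As a rough illustration only, the following sketch implements a generic SimCLR-style NT-Xent loss in plain Python. The function names, the temperature value, and the toy embeddings are assumptions made for this example; the loss actually used in IcicleGCN may differ, and the linked repository is the authoritative reference.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Generic NT-Xent-style loss over two augmented views of one batch.

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    Assumption: this is the common SimCLR-style formulation, used here for
    illustration; it is not taken from the IcicleGCN code.
    """
    n = len(view_a)
    z = [l2_normalize(v) for v in view_a] + [l2_normalize(v) for v in view_b]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Temperature-scaled cosine similarity to every other embedding.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits.values())
        log_denominator = m + math.log(
            sum(math.exp(s - m) for s in logits.values())
        )
        total += log_denominator - logits[pos]
    return total / (2 * n)


# Toy batch of two images with 2-D embeddings (hypothetical values).
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch, the loss is lower when the two views of each image map to the same embedding (`aligned`) than when they are swapped (`mismatched`), which is the behavior an instance-level contrastive objective is meant to reward.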
