Paper Title

Boosting Graph Neural Networks via Adaptive Knowledge Distillation

Paper Authors

Zhichun Guo, Chunhui Zhang, Yujie Fan, Yijun Tian, Chuxu Zhang, Nitesh Chawla

Paper Abstract

Graph neural networks (GNNs) have shown remarkable performance on diverse graph mining tasks. Although different GNNs can be unified as the same message passing framework, they learn complementary knowledge from the same graph. Knowledge distillation (KD) is developed to combine the diverse knowledge from multiple models. It transfers knowledge from high-capacity teachers to a lightweight student. However, to avoid oversmoothing, GNNs are often shallow, which deviates from the setting of KD. In this context, we revisit KD by separating its benefits from model compression and emphasizing its power of transferring knowledge. To this end, we need to tackle two challenges: how to transfer knowledge from compact teachers to a student with the same capacity; and, how to exploit student GNN's own strength to learn knowledge. In this paper, we propose a novel adaptive KD framework, called BGNN, which sequentially transfers knowledge from multiple GNNs into a student GNN. We also introduce an adaptive temperature module and a weight boosting module. These modules guide the student to the appropriate knowledge for effective learning. Extensive experiments have demonstrated the effectiveness of BGNN. In particular, we achieve up to 3.05% improvement for node classification and 6.35% improvement for graph classification over vanilla GNNs.
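
The abstract mentions an adaptive temperature module and a weight boosting module for guiding the student GNN. The sketch below is only a generic, minimal illustration of these two ideas in standard temperature-scaled knowledge distillation; it is not the authors' BGNN implementation, and the function names, tensor shapes, and heuristics (deriving the temperature from teacher confidence, doubling the weight of currently misclassified nodes) are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, temperature, boost_weight):
    """Temperature-scaled KD loss with per-node boosting weights (illustrative sketch).

    student_logits, teacher_logits: [num_nodes, num_classes]
    temperature:  [num_nodes, 1]  per-node (adaptive) softening temperature
    boost_weight: [num_nodes]     per-node importance weights
    """
    # Soften both distributions with the (possibly node-specific) temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # Per-node KL divergence between teacher and student distributions.
    per_node_kl = F.kl_div(log_soft_student, soft_teacher, reduction="none").sum(dim=-1)

    # T^2 compensates for the gradient scaling introduced by the temperature.
    return (boost_weight * temperature.squeeze(-1) ** 2 * per_node_kl).mean()


# Toy usage: 5 nodes, 3 classes (all values below are made up for the example).
num_nodes, num_classes = 5, 3
student_logits = torch.randn(num_nodes, num_classes, requires_grad=True)
teacher_logits = torch.randn(num_nodes, num_classes)
labels = torch.randint(0, num_classes, (num_nodes,))

# Adaptive temperature heuristic: softer targets where the teacher is less confident.
teacher_conf = F.softmax(teacher_logits, dim=-1).max(dim=-1, keepdim=True).values
temperature = 1.0 + (1.0 - teacher_conf)           # per-node value in (1, 2)

# Weight boosting heuristic: up-weight nodes the student currently gets wrong.
wrong = (student_logits.argmax(dim=-1) != labels).float()
boost_weight = 1.0 + wrong                          # misclassified nodes count double

loss = kd_loss(student_logits, teacher_logits, temperature, boost_weight)
loss.backward()
```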
