Mitigating Spurious Correlations for Self-supervised Recommendation

Paper Title

Mitigating Spurious Correlations for Self-supervised Recommendation

Authors

Xinyu Lin, Yiyan Xu, Wenjie Wang, Yang Zhang, Fuli Feng

Abstract

Recent years have witnessed the great success of self-supervised learning (SSL) in recommendation systems. However, SSL recommender models are likely to suffer from spurious correlations, leading to poor generalization. To mitigate spurious correlations, existing work usually pursues ID-based SSL recommendation or uses feature engineering to identify spurious features. Nevertheless, ID-based SSL approaches sacrifice the positive impact of invariant features, while feature engineering methods require high-cost human labeling. To address these problems, we aim to automatically mitigate the effect of spurious correlations. This objective requires 1) automatically masking spurious features without supervision, and 2) blocking the transmission of negative effects from spurious features to other features during SSL. To handle the two challenges, we propose an invariant feature learning framework, which first divides user-item interactions into multiple environments with distribution shifts and then learns a feature mask mechanism to capture invariant features across environments. Based on the mask mechanism, we can remove the spurious features for robust predictions and block the negative effect transmission via mask-guided feature augmentation. Extensive experiments on two datasets demonstrate the effectiveness of the proposed framework in mitigating spurious correlations and improving the generalization abilities of SSL models. The code is available at https://github.com/Linxyhaha/IFL.
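
The two ideas in the abstract (a feature mask that keeps only features behaving consistently across environments, and augmentation guided by that mask) can be illustrated with a toy sketch. This is not the authors' implementation: the paper learns the mask, whereas this sketch swaps in a simple heuristic that compares each feature's correlation with the label across environments. The function names, the `tol` threshold, and the `drop_prob` parameter are all hypothetical.

```python
import random
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def invariant_mask(envs, tol=0.3):
    """Heuristic stand-in for a learned mask (assumption, not the paper's method).

    envs is a list of (feature_rows, labels) pairs, one per environment.
    Feature j gets mask 1 if its label correlation varies by at most `tol`
    across environments, and mask 0 (treated as spurious) otherwise.
    """
    n_features = len(envs[0][0][0])
    mask = []
    for j in range(n_features):
        cors = [correlation([row[j] for row in rows], labels)
                for rows, labels in envs]
        mask.append(1 if max(cors) - min(cors) <= tol else 0)
    return mask


def masked_view(row, mask, rng, drop_prob=0.5):
    """Build one augmented view of a feature row.

    Features with mask 0 are always zeroed; features with mask 1 are
    randomly zeroed with probability `drop_prob` (an assumed augmentation).
    """
    return [x if m == 1 and rng.random() >= drop_prob else 0.0
            for x, m in zip(row, mask)]


# Feature 0 tracks the label in both environments; feature 1 flips sign.
env_a = ([[1.0, 1.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0]], [1, 0, 1, 0])
env_b = ([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]], [1, 0, 1, 0])
mask = invariant_mask([env_a, env_b])
view = masked_view([1.0, 1.0], mask, random.Random(0), drop_prob=0.0)
```

In this toy data the second feature correlates positively with the label in one environment and negatively in the other, so the heuristic treats it as spurious and every augmented view zeroes it out.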

Paper Title