论文标题
使用能力释放扩散的局部超图聚类
Local Hypergraph Clustering using Capacity Releasing Diffusion
论文作者
论文摘要
本地图聚类是一项重要的机器学习任务,旨在在一组种子节点附近找到一个良好的群集。最近的结果表明,合并高阶信息可显着增强图形聚类技术的结果。该领域的大多数现有研究都集中在基于光谱图理论的技术上。但是,对本地图聚类的另一种观点是由在目标上使用最大流量和最低切割的观点,这些观点提供了明显不同的保证。例如,最近提出了一种称为容量释放扩散(CRD)的新方法,并证明可以比光谱方法更好地保护种子周围的局部结构。该方法也是第一种局部聚类技术,它通过假设在种子节点附近的良好簇来不受二次脸颊不等式的影响。在本文中,我们通过将CRD过程扩展到基于高阶模式的群集,将CRD过程扩展到群集,该过程称为HyperGraph CRD(HG-CRD),该技术被编码为HyperGraph的超牢。此外,我们从理论上表明,HG-CRD给出了称为基序电导的数量的结果,而不是先前实验中使用的偏置版本。合成数据集和现实世界图的实验结果表明,HG-CRD提高了聚类质量。
Local graph clustering is an important machine learning task that aims to find a well-connected cluster near a set of seed nodes. Recent results have revealed that incorporating higher order information significantly enhances the results of graph clustering techniques. The majority of existing research in this area focuses on spectral graph theory-based techniques. However, an alternative perspective on local graph clustering arises from using max-flow and min-cut on the objectives, which offer distinctly different guarantees. For instance, a new method called capacity releasing diffusion (CRD) was recently proposed and shown to preserve local structure around the seeds better than spectral methods. The method was also the first local clustering technique that is not subject to the quadratic Cheeger inequality by assuming a good cluster near the seed nodes. In this paper, we propose a local hypergraph clustering technique called hypergraph CRD (HG-CRD) by extending the CRD process to cluster based on higher order patterns, encoded as hyperedges of a hypergraph. Moreover, we theoretically show that HG-CRD gives results about a quantity called motif conductance, rather than a biased version used in previous experiments. Experimental results on synthetic datasets and real world graphs show that HG-CRD enhances the clustering quality.