论文标题
具有惊人密度的可解释子图:亚组发现方法
Explainable Subgraphs with Surprising Densities: A Subgroup Discovery Approach
论文作者
论文摘要
图的连接结构通常与节点的属性有关。例如,在社交网络中,两个人之间建立友谊的可能性取决于他们的属性,例如他们的年龄,地址和爱好。因此,图形的连通性因此可能可以通过形式的模式来理解:“具有属性X的个体的子组通常是(或很少)与一个具有属性y'的另一个子组中的个人的朋友。这样的规则对图表提出了潜在的可行和可推广的见解。我们提出了一种方法,该方法使用信息理论定义有趣的定义,可以在边缘密度之间有趣的高度或低点之间。这种兴趣是主观量化的,以与分析师对图的先验信息形成鲜明对比。此视图立即实现了这种模式的迭代采矿。我们的工作将先前的工作概括为密集的子图挖掘(即由单个子组引起的子图)。此外,不仅提出的方法更一般,我们还为单个子组特殊情况展示了相当大的实际优势。
The connectivity structure of graphs is typically related to the attributes of the nodes. In social networks for example, the probability of a friendship between two people depends on their attributes, such as their age, address, and hobbies. The connectivity of a graph can thus possibly be understood in terms of patterns of the form 'the subgroup of individuals with properties X are often (or rarely) friends with individuals in another subgroup with properties Y'. Such rules present potentially actionable and generalizable insights into the graph. We present a method that finds pairs of node subgroups between which the edge density is interestingly high or low, using an information-theoretic definition of interestingness. This interestingness is quantified subjectively, to contrast with prior information an analyst may have about the graph. This view immediately enables iterative mining of such patterns. Our work generalizes prior work on dense subgraph mining (i.e. subgraphs induced by a single subgroup). Moreover, not only is the proposed method more general, we also demonstrate considerable practical advantages for the single subgroup special case.