Paper Title

Bayesian community detection for networks with covariates

Authors

Luyi Shen, Arash Amini, Nathaniel Josephs, Lizhen Lin

Abstract

The increasing prevalence of network data in a vast variety of fields and the need to extract useful information out of them have spurred fast developments in related models and algorithms. Among the various learning tasks with network data, community detection, the discovery of node clusters or "communities," has arguably received the most attention in the scientific community. In many real-world applications, the network data often come with additional information in the form of node or edge covariates that should ideally be leveraged for inference. In this paper, we add to a limited literature on community detection for networks with covariates by proposing a Bayesian stochastic block model with a covariate-dependent random partition prior. Under our prior, the covariates are explicitly expressed in specifying the prior distribution on the cluster membership. Our model has the flexibility of modeling uncertainties of all the parameter estimates including the community membership. Importantly, and unlike the majority of existing methods, our model has the ability to learn the number of the communities via posterior inference without having to assume it to be known. Our model can be applied to community detection in both dense and sparse networks, with both categorical and continuous covariates, and our MCMC algorithm is very efficient with good mixing properties. We demonstrate the superior performance of our model over existing models in a comprehensive simulation study and an application to two real datasets.
