Paper Title
Dataset and Case Studies for Visual Near-Duplicates Detection in the Context of Social Media
Authors
Abstract
The massive spread of visual content through the web and social media poses both challenges and opportunities. Tracking visually similar content is an important task for studying and analyzing social phenomena related to the spread of such content. In this paper, we address this need by building a dataset of social media images and evaluating visual near-duplicate retrieval methods based on image retrieval and several advanced visual feature extraction methods. We evaluate the methods on a large-scale dataset of images that we crawled from social media, together with manipulated versions that we generated, and report promising results in terms of recall. We demonstrate the potential of this approach in two case studies: one that shows the value of building systems that support manual content review, and another that demonstrates the usefulness of automatic large-scale data analysis.
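As a rough illustration of what evaluating near-duplicate retrieval "in terms of recall" can look like, the sketch below ranks indexed images by cosine similarity of their feature vectors and checks whether the source image of each manipulated query appears among the top k results. It is a minimal sketch under stated assumptions, not the paper's pipeline: the function names (`cosine`, `top_k`, `recall_at_k`), the toy three-dimensional vectors, and the choice of cosine similarity are all hypothetical, and a real system would use features from a visual feature extractor and an approximate nearest-neighbor index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k):
    """Return the ids of the k indexed images most similar to the query."""
    ranked = sorted(
        index,
        key=lambda image_id: cosine(query, index[image_id]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(queries, index, k):
    """Fraction of manipulated queries whose source image is in the top k.

    `queries` is a list of (feature_vector, source_image_id) pairs.
    """
    if not queries:
        return 0.0
    hits = sum(
        1 for vector, source_id in queries if source_id in top_k(vector, index, k)
    )
    return hits / len(queries)


# Toy data (hypothetical): features of three original images and of three
# manipulated versions, each paired with the id of its source image.
index = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.0, 0.0, 1.0],
}
queries = [
    ([0.9, 0.1, 0.0], "img_a"),
    ([0.1, 0.8, 0.1], "img_b"),
    ([0.6, 0.0, 0.5], "img_c"),  # heavy manipulation: ranks img_a first
]

print(recall_at_k(queries, index, 1))
print(recall_at_k(queries, index, 2))
```

In this toy setup the third query is distorted enough that its source image only appears at rank 2, so recall rises as k grows. This is the usual trade-off when a retrieval system feeds candidates to manual content review.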