Paper Title

EnCoD: Distinguishing Compressed and Encrypted File Fragments

Authors

Fabio De Gaspari, Dorjan Hitaj, Giulio Pagnotta, Lorenzo De Carli, Luigi V. Mancini

Abstract

Reliable identification of encrypted file fragments is a requirement for several security applications, including ransomware detection, digital forensics, and traffic analysis. A popular approach consists of estimating high entropy as a proxy for randomness. However, many modern content types (e.g. office documents, media files, etc.) are highly compressed for storage and transmission efficiency. Compression algorithms also output high-entropy data, thus reducing the accuracy of entropy-based encryption detectors. Over the years, a variety of approaches have been proposed to distinguish encrypted file fragments from high-entropy compressed fragments. However, these approaches are typically only evaluated over a few, select data types and fragment sizes, which makes a fair assessment of their practical applicability impossible. This paper aims to close this gap by comparing existing statistical tests on a large, standardized dataset. Our results show that current approaches cannot reliably tell apart encryption and compression, even for large fragment sizes. To address this issue, we design EnCoD, a learning-based classifier which can reliably distinguish compressed and encrypted data, starting with fragments as small as 512 bytes. We evaluate EnCoD against current approaches over a large dataset of different data types, showing that it outperforms current state-of-the-art for most considered fragment sizes and data types.
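The entropy heuristic mentioned in the abstract can be illustrated with a minimal sketch. This is not EnCoD and not any of the statistical tests the paper compares; it only computes per-byte Shannon entropy to show why compressed and encrypted fragments are hard to separate on entropy alone. The helper name `byte_entropy`, the synthetic text input, the 4096-byte fragment size, and the use of `os.urandom` as a stand-in for ciphertext are all assumptions made for illustration.

```python
import math
import os
import zlib
from collections import Counter


def byte_entropy(fragment: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Stand-in data (assumptions for illustration only):
# - `text` is synthetic low-entropy content (digits and spaces),
# - `compressed` is that content after DEFLATE compression,
# - `random_like` uses OS randomness as a proxy for encrypted output.
text = " ".join(str(i) for i in range(20000)).encode()
compressed = zlib.compress(text, 9)
random_like = os.urandom(len(compressed))

FRAGMENT_SIZE = 4096  # hypothetical fragment size for this demo

for name, data in [
    ("plain text", text),
    ("compressed", compressed),
    ("random (ciphertext proxy)", random_like),
]:
    fragment = data[:FRAGMENT_SIZE]
    print(f"{name:28s} {byte_entropy(fragment):.3f} bits/byte")
```

Running this should show the plain text well below the compressed and random fragments, with the latter two both sitting near the 8 bits/byte ceiling. A single entropy threshold therefore separates plain text from the other two but leaves little margin between compression and encryption, which is the gap the abstract describes.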
