Paper Title
Image Obfuscation for Privacy-Preserving Machine Learning
Paper Authors
Paper Abstract
Privacy becomes a crucial issue when outsourcing the training of machine learning (ML) models to cloud-based platforms offering machine-learning services. While solutions based on cryptographic primitives have been developed, they incur a significant loss in accuracy or training efficiency, and require modifications to the backend architecture. A key challenge we tackle in this paper is the design of image obfuscation schemes that provide enough privacy without significantly degrading the accuracy of the ML model and the efficiency of the training process. In this endeavor, we address another challenge that has persisted so far: quantifying the degree of privacy provided by visual obfuscation mechanisms. We compare the ability of state-of-the-art full-reference quality metrics to concur with human subjects in terms of the degree of obfuscation introduced by a range of techniques. By relying on user surveys and two image datasets, we show that two existing image quality metrics are also well suited to measure the level of privacy in accordance with human subjects as well as AI-based recognition, and can therefore be used for quantifying privacy resulting from obfuscation. With the ability to quantify privacy, we show that we can provide adequate privacy protection to the training image set at the cost of only a few percentage points loss in accuracy.
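The abstract does not name the specific obfuscation techniques or quality metrics evaluated. As a minimal illustration of the general idea, the sketch below pixelates an image (one common obfuscation technique) and scores the result with PSNR, a simple full-reference quality metric; the function names and the choice of pixelation and PSNR are assumptions for illustration only, not the paper's actual method.

```python
import numpy as np

def pixelate(img, block=8):
    """Obfuscate an image by average-pooling over block x block tiles.
    A larger block size gives stronger obfuscation (illustrative choice)."""
    h, w = img.shape[:2]
    # Crop to a multiple of the block size for simplicity.
    h2, w2 = h - h % block, w - w % block
    tiles = img[:h2, :w2].astype(float)
    pooled = tiles.reshape(h2 // block, block, w2 // block, block).mean(axis=(1, 3))
    # Expand each pooled value back to its block.
    return np.repeat(np.repeat(pooled, block, axis=0), block, axis=1)

def psnr(ref, test, peak=255.0):
    """Full-reference quality metric: peak signal-to-noise ratio in dB.
    Lower PSNR means stronger distortion, i.e. (per the paper's premise)
    a lower-fidelity, more private rendition of the reference image."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

if __name__ == "__main__":
    img = (np.arange(64 * 64) % 256).reshape(64, 64).astype(np.uint8)
    for b in (4, 8, 16):
        print(b, round(psnr(img, pixelate(img, b)), 2))
```

Because PSNR compares the obfuscated image against the original, it can serve as a proxy for how much visual information survives obfuscation; the paper's contribution is identifying which such metrics actually agree with human subjects and AI-based recognizers.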