Title
Data-driven Regularized Inference Privacy
Authors
Abstract
Data is used widely by service providers as input to inference systems to perform decision making for authorized tasks. The raw data, however, allows a service provider to infer other sensitive information it has not been authorized for. We propose a data-driven inference privacy-preserving framework to sanitize data so as to prevent leakage of sensitive information present in the raw data, while ensuring that the sanitized data remains compatible with the service provider's legacy inference system. We develop an inference privacy framework based on the variational method and include maximum mean discrepancy and domain adaptation as techniques to regularize the domain of the sanitized data to ensure its legacy compatibility. However, the variational method leads to weak privacy in cases where the underlying data distribution is hard to approximate. It may also face difficulties when handling continuous private variables. To overcome this, we propose an alternative formulation of the privacy metric using maximal correlation, and we present empirical methods to estimate it. Finally, we develop a deep learning model as an example of the proposed inference privacy framework. Numerical experiments verify the feasibility of our approach.
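The abstract uses maximum mean discrepancy (MMD) as a regularizer that keeps the sanitized data distribution close to the raw data distribution. As a minimal illustration of that quantity (not the paper's actual implementation), the following sketch computes the standard biased empirical estimate of squared MMD between two samples, using an RBF kernel with an assumed bandwidth `sigma`:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """RBF (Gaussian) kernel matrix between rows of X and rows of Y."""
    # Pairwise squared Euclidean distances via the expansion ||x-y||^2
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    """Biased empirical estimate of squared MMD between samples X and Y.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    estimated here by averaging over all pairs (including diagonal terms).
    """
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 1.0, size=(200, 2))   # "raw data" sample
    Y = rng.normal(0.0, 1.0, size=(200, 2))   # sample from the same distribution
    Z = rng.normal(3.0, 1.0, size=(200, 2))   # sample from a shifted distribution
    # MMD between matched distributions is much smaller than between mismatched ones.
    print(mmd2(X, Y), mmd2(X, Z))
```

In a sanitization pipeline of this kind, a term like `mmd2(sanitized, raw)` would be added to the training loss so the sanitizer's output stays in the domain that the legacy inference system expects.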