Paper Title
Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies
Paper Authors
Paper Abstract
Estimating the effect of an intervention from observational data while accounting for confounding variables is a key task in causal inference. Often the confounders are unobserved, but we have access to large amounts of additional unstructured data (images, text) that contain valuable proxy signals about the missing confounders. This paper argues that leveraging this unstructured data can greatly improve the accuracy of causal effect estimation. Specifically, we introduce deep multi-modal structural equations, a generative model for causal effect estimation in which confounders are latent variables and unstructured data are proxy variables. This model supports multiple multi-modal proxies (images, text) as well as missing data. We empirically demonstrate that our approach outperforms existing methods based on propensity scores and corrects for confounding using unstructured inputs on tasks in genomics and healthcare. Our methods can potentially support the use of large amounts of data that were previously not used in causal inference.
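The setting described in the abstract (a latent confounder, a noisy proxy, a treatment, and an outcome) can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes a toy linear data-generating process with a scalar confounder `z`, a scalar proxy `x` standing in for image or text features, and plain linear regression as the adjustment method. All function names, coefficients, and noise levels are hypothetical choices made for illustration.

```python
import math
import random
from statistics import fmean


def simulate(n, tau=2.0, seed=0):
    """Draw n samples from a toy structural model (all coefficients are assumptions)."""
    rng = random.Random(seed)
    z, x, t, y = [], [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)                    # latent confounder (unobserved in practice)
        xi = zi + rng.gauss(0.0, 0.5)               # noisy scalar proxy of the confounder
        p = 1.0 / (1.0 + math.exp(-2.0 * zi))       # treatment probability depends on z
        ti = 1.0 if rng.random() < p else 0.0       # binary treatment
        yi = tau * ti + 3.0 * zi + rng.gauss(0.0, 1.0)  # outcome depends on t and z
        z.append(zi)
        x.append(xi)
        t.append(ti)
        y.append(yi)
    return z, x, t, y


def slope(u, v):
    """OLS slope of v regressed on u, with an intercept."""
    mu, mv = fmean(u), fmean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var = sum((a - mu) ** 2 for a in u)
    return cov / var


def residualize(v, c):
    """Return v minus its best linear fit on c."""
    b = slope(c, v)
    mc, mv = fmean(c), fmean(v)
    return [vi - mv - b * (ci - mc) for vi, ci in zip(v, c)]


def adjusted_effect(t, y, c):
    """Coefficient on t in the regression of y on (1, t, c), via Frisch-Waugh-Lovell."""
    return slope(residualize(t, c), residualize(y, c))


def estimate_all(n=20000, tau=2.0, seed=0):
    z, x, t, y = simulate(n, tau=tau, seed=seed)
    return {
        "naive": slope(t, y),                # difference in means, ignores confounding
        "proxy": adjusted_effect(t, y, x),   # adjusts for the noisy proxy only
        "oracle": adjusted_effect(t, y, z),  # adjusts for the true confounder
    }


if __name__ == "__main__":
    for name, value in estimate_all().items():
        print(f"{name}: {value:.3f}")
```

In this toy setup the naive estimate is biased upward, adjusting for the true confounder recovers the effect, and adjusting for the noisy proxy directly removes only part of the bias. That residual bias is the kind of gap that treating the confounder as a latent variable, with the unstructured data as its proxies, is meant to address.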