Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

Paper Title

Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

Authors

Chengchun Shi, Xiaoyu Wang, Shikai Luo, Hongtu Zhu, Jieping Ye, Rui Song

Abstract

A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technology, and traditional industries. Major challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time affects the current outcome as well as future outcomes. The aim of this paper is to introduce a reinforcement learning framework for carrying out A/B testing in these experiments while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a technology company to illustrate its advantage over current practice. A Python implementation of our test is available at https://github.com/callmespring/CausalRL.
