Paper Title
Improving Estimation Efficiency for Two-Phase, Outcome-Dependent Sampling Studies
Paper Authors
Paper Abstract
Two-phase outcome-dependent sampling (ODS) is widely used in many fields, especially when certain covariates are expensive and/or difficult to measure. For two-phase ODS, the conditional maximum likelihood (CML) method is very attractive because it can handle zero Phase 2 selection probabilities and avoids modeling the covariate distribution. However, most existing CML-based methods use only the Phase 2 sample and thus may be less efficient than other methods. We propose a general empirical likelihood method that augments CML with additional information from the whole Phase 1 sample to improve estimation efficiency. The proposed method retains the ability to handle zero selection probabilities and avoids modeling the covariate distribution, but, because it makes effective use of the Phase 1 data, it can yield substantial efficiency gains over CML for the inexpensive covariates, or for the influential covariate when a surrogate is available. Simulations and a real data illustration using NHANES data are presented.
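As a reading aid, the two properties of CML named in the abstract can be seen in the generic form of the conditional likelihood. This is a minimal sketch of the standard textbook form, not the paper's own notation. The symbols are assumptions introduced here: $R$ is the Phase 2 selection indicator, $\pi(y) = P(R = 1 \mid Y = y)$ is the design selection probability (assumed to depend on the outcome only), $f(y \mid x; \theta)$ is the outcome model, and $n_2$ is the Phase 2 sample size.

```latex
% Density of Y given covariates X = x and selection into Phase 2 (R = 1).
% Assumed notation; selection is taken to depend on the outcome y only.
\[
  f(y \mid x, R = 1; \theta)
    = \frac{\pi(y)\, f(y \mid x; \theta)}
           {\int \pi(t)\, f(t \mid x; \theta)\, dt}
\]
% The CML estimator maximizes the conditional log-likelihood
% over the n_2 subjects in the Phase 2 sample only.
\[
  \hat{\theta}_{\mathrm{CML}}
    = \arg\max_{\theta} \sum_{i=1}^{n_2}
      \log f(y_i \mid x_i, R_i = 1; \theta)
\]
```

Under these assumptions the marginal distribution of $X$ cancels from the conditional density, so it never has to be modeled. Regions where $\pi(y) = 0$ simply contribute nothing to the integral in the denominator, whereas inverse probability weighting would need $\pi(y) > 0$ everywhere. The sum runs over Phase 2 subjects only, which is the source of the efficiency loss that the proposed method aims to recover using Phase 1 data.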