Paper Title

Statistical and Computational Guarantees for Influence Diagnostics

Authors

Jillian Fisher, Lang Liu, Krishna Pillutla, Yejin Choi, Zaid Harchaoui

Abstract

Influence diagnostics such as influence functions and approximate maximum influence perturbations are popular in machine learning and in AI domain applications. Influence diagnostics are powerful statistical tools to identify influential datapoints or subsets of datapoints. We establish finite-sample statistical bounds, as well as computational complexity bounds, for influence functions and approximate maximum influence perturbations using efficient inverse-Hessian-vector product implementations. We illustrate our results with generalized linear models and large attention-based models on synthetic and real data.
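
As a rough illustration of the objects the abstract refers to, the sketch below computes the influence function of a single training point for an L2-regularized logistic regression (one example of a generalized linear model), using the standard form -H^{-1} ∇ℓ(z_i, θ̂). The inverse-Hessian-vector product is obtained with conjugate gradient on Hessian-vector products, which is one common choice and not necessarily the implementation analyzed in the paper. All function names, the synthetic data, and the regularization value are assumptions made for this example.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(X, y, reg=1e-2, iters=50):
    """Newton's method for L2-regularized logistic regression (hypothetical helper)."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) / n + reg * theta
        H = (X.T * (p * (1 - p))) @ X / n + reg * np.eye(d)
        theta -= np.linalg.solve(H, grad)
    return theta

def hvp(X, theta, v, reg):
    """Hessian-vector product H @ v without forming the d x d Hessian."""
    p = sigmoid(X @ theta)
    w = p * (1 - p)
    return X.T @ (w * (X @ v)) / X.shape[0] + reg * v

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A, given only x -> A x."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0
    p = r.copy()          # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def influence(X, y, theta, i, reg):
    """Influence of training point i on the parameters: -H^{-1} grad loss_i."""
    # Gradient of the unregularized logistic loss at point i (assumption:
    # only the data-fit term of that point is upweighted).
    g_i = (sigmoid(X[i] @ theta) - y[i]) * X[i]
    return -conjugate_gradient(lambda v: hvp(X, theta, v, reg), g_i)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (rng.random(200) < sigmoid(X @ np.ones(5))).astype(float)
reg = 1e-2
theta_hat = fit_logistic(X, y, reg)
infl = influence(X, y, theta_hat, 0, reg)
```

Here `infl` is a vector with one entry per parameter. Working with Hessian-vector products keeps the cost of each conjugate gradient step linear in the number of parameters, which is why this style of solver is attractive when the Hessian is too large to form or invert directly.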
