Paper Title
When Personalization Harms: Reconsidering the Use of Group Attributes in Prediction
Paper Authors
Paper Abstract
Machine learning models are often personalized with categorical attributes that are protected, sensitive, self-reported, or costly to acquire. In this work, we show that models personalized with group attributes can reduce performance at a group level. We propose formal conditions to ensure the "fair use" of group attributes in prediction tasks by training one additional model -- i.e., collective preference guarantees to ensure that each group who provides personal data will receive a tailored gain in performance in return. We present sufficient conditions to ensure fair use in empirical risk minimization and characterize failure modes that lead to fair use violations due to standard practices in model development and deployment. We present a comprehensive empirical study of fair use in clinical prediction tasks. Our results demonstrate the prevalence of fair use violations in practice and illustrate simple interventions to mitigate their harm.
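The abstract's core idea is that fair use can be audited by training one additional model: a generic model fit without the group attribute, whose per-group performance serves as the baseline that the personalized model must match or beat for every group. Below is a minimal illustrative sketch of such a check, not the paper's implementation; it assumes scikit-learn-style classifiers, a numerically encoded group attribute, and a held-out test split, and the helper name fair_use_gains is hypothetical.

```python
# Illustrative sketch: compare a personalized model (trained with the group
# attribute as an extra feature) against a generic model (trained without it)
# and report each group's gain in test accuracy from personalization.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def fair_use_gains(X, g, y, X_test, g_test, y_test):
    """Return {group: gain}, where gain = personalized accuracy - generic accuracy.

    Assumes g and g_test are numerically encoded group attributes.
    """
    generic = LogisticRegression(max_iter=1000).fit(X, y)
    personalized = LogisticRegression(max_iter=1000).fit(
        np.column_stack([X, g]), y)
    gains = {}
    for grp in np.unique(g_test):
        mask = g_test == grp
        acc_generic = accuracy_score(y_test[mask], generic.predict(X_test[mask]))
        acc_personalized = accuracy_score(
            y_test[mask],
            personalized.predict(np.column_stack([X_test[mask], g_test[mask]])))
        # A negative gain means this group would have been better served by the
        # generic model, i.e., a potential fair use violation in this sketch.
        gains[grp] = acc_personalized - acc_generic
    return gains
```

Under this reading, a fair use violation corresponds to any group whose gain is negative: the group supplied its attribute yet received worse predictions than the generic model would have provided.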