Paper title
Sex Trouble: Common pitfalls in incorporating sex/gender in medical machine learning and how to avoid them
Paper authors
Paper abstract
False assumptions about sex and gender are deeply embedded in the medical system, including that they are binary, static, and concordant. Machine learning researchers must understand the nature of these assumptions in order to avoid perpetuating them. In this perspectives piece, we identify three common mistakes that researchers make when dealing with sex/gender data: "sex confusion", the failure to identify what sex in a dataset does or doesn't mean; "sex obsession", the belief that sex, specifically sex assigned at birth, is the relevant variable for most applications; and "sex/gender slippage", the conflation of sex and gender even in contexts where only one or the other is known. We then discuss how these pitfalls show up in machine learning studies based on electronic health record data, which is commonly used for everything from retrospective analysis of patient outcomes to the development of algorithms to predict risk and administer care. Finally, we offer a series of recommendations about how machine learning researchers can produce both research and algorithms that more carefully engage with questions of sex/gender, better serving all patients, including transgender people.