Multi-site fMRI Analysis Using Privacy-preserving Federated Learning and Domain Adaptation: ABIDE Results

Paper title

Multi-site fMRI Analysis Using Privacy-preserving Federated Learning and Domain Adaptation: ABIDE Results

Authors

Xiaoxiao Li, Yufeng Gu, Nicha Dvornek, Lawrence Staib, Pamela Ventola, James S. Duncan

Abstract

Deep learning models have shown their advantage in many different tasks, including neuroimage analysis. However, to effectively train a high-quality deep learning model, the aggregation of a significant amount of patient information is required. The time and cost for acquisition and annotation in assembling, for example, large fMRI datasets make it difficult to acquire large numbers at a single site. However, due to the need to protect the privacy of patient data, it is hard to assemble a central database from multiple institutions. Federated learning allows for population-level models to be trained without centralizing entities' data by transmitting the global model to local entities, training the model locally, and then averaging the gradients or weights in the global model. However, some studies suggest that private information can be recovered from the model gradients or weights. In this work, we address the problem of multi-site fMRI classification with a privacy-preserving strategy. To solve the problem, we propose a federated learning approach, where a decentralized iterative optimization algorithm is implemented and shared local model weights are altered by a randomization mechanism. Considering the systemic differences of fMRI distributions from different sites, we further propose two domain adaptation methods in this federated learning formulation. We investigate various practical aspects of federated model optimization and compare federated learning with alternative training strategies. Overall, our results demonstrate that it is promising to utilize multi-site data without data sharing to boost neuroimage analysis performance and find reliable disease-related biomarkers. Our proposed pipeline can be generalized to other privacy-sensitive medical data analysis problems.
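
The training loop described in the abstract (send the global model to each site, train locally, perturb the shared weights, then average) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the logistic-regression model, the Gaussian noise used as the randomization mechanism, the size-weighted averaging, the synthetic data, and every function name and parameter below are assumptions made for the example. The domain adaptation methods mentioned in the abstract are not covered.

```python
import math
import random

def local_update(weights, data, lr=0.1, steps=20):
    """Run a few gradient-descent steps of logistic regression on one site's data."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            for j, xj in enumerate(x):
                grad[j] += (p - y) * xj / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def randomize(weights, noise_std, rng):
    """Perturb weights with Gaussian noise before sharing (assumed mechanism)."""
    return [wi + rng.gauss(0.0, noise_std) for wi in weights]

def federated_round(global_w, sites, noise_std, rng):
    """Broadcast the global model, train locally, perturb, then average by site size."""
    shared = [randomize(local_update(global_w, data), noise_std, rng) for data in sites]
    total = sum(len(data) for data in sites)
    return [
        sum(len(data) * w[j] for data, w in zip(sites, shared)) / total
        for j in range(len(global_w))
    ]

def make_sites(rng, true_w=(1.0, -2.0, 0.5), n_sites=4, n_per_site=100):
    """Synthetic stand-in for per-site features; each site gets its own small shift."""
    sites = []
    for _ in range(n_sites):
        shift = [rng.gauss(0.0, 0.2) for _ in true_w]
        data = []
        for _ in range(n_per_site):
            x = [rng.gauss(s, 1.0) for s in shift]
            y = 1.0 if sum(wi * xi for wi, xi in zip(true_w, x)) > 0 else 0.0
            data.append((x, y))
        sites.append(data)
    return sites

def accuracy(w, sites):
    """Fraction of samples, pooled over all sites, that the linear model classifies correctly."""
    pairs = [(x, y) for data in sites for x, y in data]
    hits = sum((sum(wi * xi for wi, xi in zip(w, x)) > 0) == (y == 1.0) for x, y in pairs)
    return hits / len(pairs)

rng = random.Random(0)
sites = make_sites(rng)
w = [0.0, 0.0, 0.0]
for _ in range(30):  # 30 communication rounds; raw data never leaves a site
    w = federated_round(w, sites, noise_std=0.01, rng=rng)
```

Only the perturbed weight vectors cross site boundaries in this sketch. Raising `noise_std` hides more about each site's update but should be expected to cost accuracy, which is the kind of trade-off a privacy-preserving setup has to tune.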
