CTT-Net: A Multi-view Cross-token Transformer for Cataract Postoperative Visual Acuity Prediction

Paper Title

CTT-Net: A Multi-view Cross-token Transformer for Cataract Postoperative Visual Acuity Prediction

Authors

Wang, Jinhong; Wang, Jingwen; Chen, Tingting; Zheng, Wenhao; Xu, Zhe; Wu, Xingdi; Xu, Wen; Ying, Haochao; Chen, Danny; Wu, Jian

Abstract

Surgery is the only viable treatment for cataract patients with visual acuity (VA) impairment. To assess whether cataract surgery is warranted, clinicians need to predict postoperative VA accurately before surgery by analyzing multi-view optical coherence tomography (OCT) images. Unfortunately, complicated fundus conditions make postoperative VA difficult for medical experts to determine. Deep learning methods for this problem have been developed in recent years. Although effective, these methods still face several issues: they do not efficiently explore the potential relations between multi-view OCT images, they neglect the key role of clinical prior knowledge (e.g., the preoperative VA value), and they rely only on regression-based metrics, which lack a reference. In this paper, we propose a novel Cross-token Transformer Network (CTT-Net) that predicts postoperative VA by analyzing both multi-view OCT images and preoperative VA. To fuse the multi-view features of OCT images effectively, we develop a cross-token attention that restricts redundant or unnecessary attention flow. Further, we use the preoperative VA value to provide more information for postoperative VA prediction and to facilitate fusion between views. Moreover, we design an auxiliary classification loss to improve model performance and to assess VA recovery more fully, avoiding the limitation of relying on regression metrics alone. To evaluate CTT-Net, we build a multi-view OCT image dataset collected from our collaborating hospital. Extensive experiments validate the effectiveness of our model compared with existing methods across various metrics. Code is available at: https://github.com/wjh892521292/Cataract OCT.
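
The abstract mentions pairing VA regression with an auxiliary classification loss but gives no formula. As a rough illustration of that general idea only, the sketch below adds a weighted cross-entropy term to a mean-squared-error term. The class bins in `va_to_class`, the weight `aux_weight`, the VA scale, and all function names are hypothetical choices made for this example; they are not taken from the paper or its code.

```python
import math

def mse_loss(pred, target):
    """Mean squared error between predicted and true VA values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of logit vectors."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[label]
    return total / len(labels)

def va_to_class(va, edges=(0.3, 0.6)):
    """Hypothetical binning of a VA value into coarse recovery classes.

    The bin edges are assumptions for illustration, not from the paper.
    """
    return sum(va >= e for e in edges)

def joint_loss(pred_va, logits, true_va, aux_weight=0.5):
    """Regression loss plus a weighted auxiliary classification loss.

    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    labels = [va_to_class(v) for v in true_va]
    return mse_loss(pred_va, true_va) + aux_weight * cross_entropy(logits, labels)
```

Under these assumptions, the classification term also gives a categorical view of recovery (which bin a patient lands in), complementing the purely numeric regression error.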

