Paper Title

LILE: Look In-Depth before Looking Elsewhere -- A Dual Attention Network using Transformers for Cross-Modal Information Retrieval in Histopathology Archives

Paper Authors

Danial Maleki, H. R. Tizhoosh

Paper Abstract

The volume of available data has grown dramatically in recent years across many applications. Furthermore, the era of networks that handle each modality separately has practically ended. Therefore, enabling bidirectional cross-modal data retrieval has become a requirement for many domains and disciplines of research. This is especially true in the medical field, as data comes in a multitude of types, including various kinds of images and reports as well as molecular data. Most contemporary works apply cross-attention to highlight the essential elements of an image or text in relation to the other modality and try to match them together. However, these approaches usually weigh the features of each modality equally, regardless of their importance within their own modality. In this study, self-attention is proposed as an additional loss term to enrich the internal representation fed into the cross-attention module. This work suggests a novel architecture with a new loss term to help represent images and texts in the joint latent space. Experimental results on two benchmark datasets, i.e., MS-COCO and ARCH, demonstrate the effectiveness of the proposed method.
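The abstract only sketches the idea, but the core mechanism it describes (self-attention to "look in-depth" within each modality, cross-attention to "look elsewhere" across modalities, plus an auxiliary loss on the self-attended features) can be illustrated in a few lines. Below is a minimal, hypothetical PyTorch sketch: the class name DualAttentionMatcher, the MSE-based consistency term, and all dimensions are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttentionMatcher(nn.Module):
    # Illustrative sketch of the dual-attention idea: self-attention
    # refines each modality ("look in-depth"), then cross-attention
    # relates the two modalities ("look elsewhere").
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.img_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img, txt):
        # img: (batch, image_regions, dim); txt: (batch, tokens, dim)
        img_refined, _ = self.img_self(img, img, img)
        txt_refined, _ = self.txt_self(txt, txt, txt)
        # Text tokens attend over image regions in the joint space.
        txt_to_img, _ = self.cross(txt_refined, img_refined, img_refined)
        return img_refined, txt_refined, txt_to_img

def auxiliary_self_attention_loss(refined, raw):
    # One plausible reading of the extra loss term: keep the
    # self-attended features consistent with the raw inputs so the
    # cross-attention module receives an enriched but faithful
    # representation. (Assumption; the abstract does not specify it.)
    return F.mse_loss(refined, raw)

# Toy usage with random features standing in for encoder outputs.
model = DualAttentionMatcher()
img = torch.randn(2, 36, 256)   # e.g., 36 detected image regions
txt = torch.randn(2, 20, 256)   # e.g., 20 caption tokens
img_r, txt_r, txt_to_img = model(img, txt)
loss = (auxiliary_self_attention_loss(img_r, img)
        + auxiliary_self_attention_loss(txt_r, txt))
```

In a full retrieval pipeline, this auxiliary term would be added to a standard image-text matching objective over the joint latent space; the abstract does not state which matching loss the authors use.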
