Paper Title


TransCL: Transformer Makes Strong and Flexible Compressive Learning

Paper Authors

Chong Mou, Jian Zhang

Paper Abstract


Compressive learning (CL) is an emerging framework that integrates signal acquisition via compressed sensing (CS) and machine learning for inference tasks directly on a small number of measurements. It can be a promising alternative to classical image-domain methods and enjoys great advantages in memory saving and computational efficiency. However, previous attempts at CL are not only limited to a fixed CS ratio, which lacks flexibility, but also limited to MNIST/CIFAR-like datasets and do not scale to complex real-world high-resolution (HR) data or vision tasks. In this paper, a novel transformer-based compressive learning framework on large-scale images with arbitrary CS ratios, dubbed TransCL, is proposed. Specifically, TransCL first utilizes the strategy of learnable block-based compressed sensing and proposes a flexible linear projection strategy to enable CL to be performed on large-scale images in an efficient block-by-block manner with arbitrary CS ratios. Then, regarding CS measurements from all blocks as a sequence, a pure transformer-based backbone is deployed to perform vision tasks with various task-oriented heads. Our thorough analysis shows that TransCL exhibits strong resistance to interference and robust adaptability to arbitrary CS ratios. Extensive experiments on complex HR data demonstrate that the proposed TransCL can achieve state-of-the-art performance in image classification and semantic segmentation tasks. In particular, TransCL with a CS ratio of $10\%$ can obtain almost the same performance as when operating directly on the original data and can still obtain satisfactory performance even with an extremely low CS ratio of $1\%$. The source code of our proposed TransCL is available at \url{https://github.com/MC-E/TransCL/}.
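The abstract outlines a two-stage pipeline: learnable block-based compressed sensing with a flexible linear projection, whose per-block measurements are then treated as a token sequence by a pure transformer backbone with task-oriented heads. Below is a minimal PyTorch sketch of that pipeline for classification; the module names, block size, row-truncation trick for arbitrary CS ratios, zero-padding of measurements, and classification head are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Minimal sketch (not the authors' code): block-based compressed sensing
# followed by a transformer backbone, as outlined in the abstract.
import math
import torch
import torch.nn as nn


class BlockCS(nn.Module):
    """Learnable block-based compressed sensing with an arbitrary CS ratio."""

    def __init__(self, block_size=32):
        super().__init__()
        self.block_size = block_size
        n = block_size * block_size
        # Full learnable sensing matrix; at ratio r only the first ceil(r * n)
        # rows are used (one simple way to support arbitrary CS ratios; an
        # assumption for this sketch, not necessarily the paper's mechanism).
        self.phi = nn.Parameter(torch.randn(n, n) / math.sqrt(n))

    def forward(self, x, cs_ratio=0.1):
        b, c, h, w = x.shape
        s = self.block_size
        assert h % s == 0 and w % s == 0, "image size must be a multiple of the block size"
        n = s * s
        m = max(1, math.ceil(cs_ratio * n))
        # Split the image into non-overlapping s x s blocks and flatten each one.
        blocks = x.unfold(2, s, s).unfold(3, s, s)          # (b, c, h/s, w/s, s, s)
        blocks = blocks.contiguous().view(b, c, -1, n)      # (b, c, L, s*s)
        y = blocks @ self.phi[:m].t()                       # (b, c, L, m) measurements
        # Zero-pad each block's measurements to a fixed length so one token
        # embedding works for every CS ratio (an illustrative choice).
        pad = y.new_zeros(b, c, y.shape[2], n - m)
        return torch.cat([y, pad], dim=-1).flatten(1, 2)    # (b, c*L, s*s)


class TransCLSketch(nn.Module):
    """CS measurements as a token sequence -> transformer backbone -> task head."""

    def __init__(self, block_size=32, embed_dim=256, depth=4, num_classes=10):
        super().__init__()
        self.cs = BlockCS(block_size)
        self.embed = nn.Linear(block_size * block_size, embed_dim)
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(embed_dim, num_classes)  # e.g., a classification head

    def forward(self, x, cs_ratio=0.1):
        tokens = self.embed(self.cs(x, cs_ratio))   # (b, num_blocks*c, embed_dim)
        feats = self.backbone(tokens)
        return self.head(feats.mean(dim=1))         # pooled class logits


# Example: classify grayscale 224x224 images sensed at a 10% CS ratio.
model = TransCLSketch(block_size=32, num_classes=10)
logits = model(torch.randn(2, 1, 224, 224), cs_ratio=0.10)
print(logits.shape)  # torch.Size([2, 10])
```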
