Paper Title

A Self-Adjusting Fusion Representation Learning Model for Unaligned Text-Audio Sequences

Authors

Kaicheng Yang, Ruxuan Zhang, Hua Xu, Kai Gao

Abstract

Inter-modal interaction plays an indispensable role in multimodal sentiment analysis. Because sequences from different modalities are usually unaligned, integrating the relevant information from each modality to learn fusion representations is one of the central challenges in multimodal learning. In this paper, a Self-Adjusting Fusion Representation Learning Model (SA-FRLM) is proposed to learn robust crossmodal fusion representations directly from unaligned text and audio sequences. Unlike previous work, our model not only makes full use of the interaction between different modalities but also preserves the unimodal characteristics as far as possible. Specifically, we first employ a crossmodal alignment module to project the features of the different modalities to the same dimension. Crossmodal collaboration attention is then adopted to model the inter-modal interaction between the text and audio sequences and to initialize the fusion representations. After that, the crossmodal adjustment transformer, the core unit of SA-FRLM, is proposed to protect the original unimodal characteristics. It dynamically adapts the fusion representations using the single-modality streams. We evaluate our approach on the public multimodal sentiment analysis datasets CMU-MOSI and CMU-MOSEI. The experimental results show that our model significantly improves performance on all metrics for unaligned text-audio sequences.
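The first two steps described in the abstract (projecting both modalities to a shared dimension, then letting one sequence attend to the other) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: plain linear projections and standard scaled dot-product cross-attention stand in for the paper's alignment and collaboration attention modules, and every name, weight, and feature size below is a hypothetical choice made for illustration. The crossmodal adjustment transformer is not covered.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def project(seq, weight):
    """Linearly project a (length, feat_dim) sequence to a shared dimension.

    Assumption: a plain linear map stands in for the alignment module.
    """
    return seq @ weight


def cross_attention(query_seq, context_seq):
    """Scaled dot-product attention from one modality to another.

    The two sequences may have different lengths, so no word-level
    alignment between them is needed. Returns the attended sequence
    (same length as query_seq) and the attention weights.
    """
    d_model = query_seq.shape[-1]
    scores = query_seq @ context_seq.T / np.sqrt(d_model)
    weights = softmax(scores, axis=-1)
    return weights @ context_seq, weights


rng = np.random.default_rng(0)

# Hypothetical unaligned inputs: 20 text steps and 50 audio frames,
# each with its own feature size.
text = rng.normal(size=(20, 48))
audio = rng.normal(size=(50, 16))

# Hypothetical random projection weights to a shared dimension of 32.
d_model = 32
w_text = rng.normal(size=(48, d_model)) / np.sqrt(48)
w_audio = rng.normal(size=(16, d_model)) / np.sqrt(16)

text_p = project(text, w_text)
audio_p = project(audio, w_audio)

# Text queries attend over audio frames to form an initial fused sequence.
fused, attn = cross_attention(text_p, audio_p)
print(fused.shape, attn.shape)
```

Because attention weights are computed between every text step and every audio frame, the two sequences never need to share a length, which is the property that makes working directly on unaligned sequences possible.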
