Paper Title
Learning Quality-aware Dynamic Memory for Video Object Segmentation
Authors
Abstract
Recently, several spatial-temporal memory-based methods have verified that storing intermediate frames and their masks as memory is helpful for segmenting target objects in videos. However, they mainly focus on better matching between the current frame and the memory frames, without explicitly paying attention to the quality of the memory. As a result, frames with poor segmentation masks are prone to being memorized, which leads to an accumulation of segmentation mask errors and further degrades segmentation performance. In addition, the number of memory frames grows linearly with the number of frames, which limits the ability of these models to handle long videos. To this end, we propose a Quality-aware Dynamic Memory Network (QDMN) that evaluates the segmentation quality of each frame, allowing the memory bank to selectively store accurately segmented frames and thereby prevent error accumulation. We then combine segmentation quality with temporal consistency to dynamically update the memory bank, which improves the practicability of the model. Without any bells and whistles, our QDMN achieves new state-of-the-art performance on both the DAVIS and YouTube-VOS benchmarks. Moreover, extensive experiments demonstrate that the proposed Quality Assessment Module (QAM) can be applied to memory-based methods as a generic plugin and significantly improves their performance. Our source code is available at https://github.com/workforai/QDMN.
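The two ideas in the abstract, storing only accurately segmented frames and updating a bounded memory bank using both quality and temporal information, can be illustrated with a small sketch. This is not the QDMN implementation: the class names, the fixed quality threshold, the capacity, and the scoring rule that mixes quality with recency are all hypothetical choices made for illustration, and the quality score is assumed to come from some external assessment module.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class MemoryEntry:
    frame_idx: int
    quality: float        # assumed: predicted mask quality in [0, 1]
    features: Any = None  # placeholder for frame/mask embeddings


@dataclass
class QualityAwareMemory:
    """Illustrative bounded memory bank (hypothetical, not the paper's code)."""
    capacity: int = 4               # assumed upper bound on stored frames
    quality_threshold: float = 0.8  # assumed admission threshold
    temporal_weight: float = 0.5    # assumed mix between quality and recency
    entries: List[MemoryEntry] = field(default_factory=list)

    def _keep_score(self, entry: MemoryEntry, current_idx: int) -> float:
        # Recency is 1.0 for the current frame and decays with frame distance.
        recency = 1.0 / (1.0 + (current_idx - entry.frame_idx))
        w = self.temporal_weight
        return (1.0 - w) * entry.quality + w * recency

    def update(self, frame_idx: int, quality: float, features: Any = None) -> bool:
        """Try to memorize a frame; return True if it was admitted."""
        # Quality gate: poorly segmented frames are never written to memory.
        if quality < self.quality_threshold:
            return False
        self.entries.append(MemoryEntry(frame_idx, quality, features))
        # Dynamic update: when over capacity, evict the lowest-scoring entry,
        # so memory size stays constant regardless of video length.
        if len(self.entries) > self.capacity:
            worst = min(self.entries, key=lambda e: self._keep_score(e, frame_idx))
            self.entries.remove(worst)
        return True
```

With a capacity of 2, feeding frames 0 to 3 with quality scores 0.95, 0.5, 0.85, and 0.9 rejects frame 1 at the gate and, once the bank is full, evicts frame 0 because its recency term has decayed the most. The bound on `entries` is what keeps memory cost from growing linearly with video length.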