Spatial Aware Multi-Task Learning Based Speech Separation

Paper Title

Spatial Aware Multi-Task Learning Based Speech Separation

Authors

Sun, Wei; Wang, Mei; Qiu, Lili

Abstract

During the COVID-19 pandemic, online meetings became an indispensable part of our lives. This trend is likely to continue due to their convenience and broad reach. However, background noise from other family members, roommates, and office-mates not only degrades voice quality but also raises serious privacy issues. In this paper, we develop a novel system, called Spatial Aware Multi-task learning-based Separation (SAMS), to extract the target user's audio signal during teleconferencing. Our solution consists of three novel components: (i) generating fine-grained location embeddings from the user's voice and an inaudible tracking sound, which contain the user's position and rich multipath information, (ii) developing a source separation neural network using multi-task learning to jointly optimize source separation and localization, and (iii) significantly speeding up inference to provide a real-time guarantee. Our testbed experiments demonstrate the effectiveness of our approach.
