SCP-GAN: Self-Correcting Discriminator Optimization for Training Consistency Preserving Metric GAN on Speech Enhancement Tasks

Paper Title

SCP-GAN: Self-Correcting Discriminator Optimization for Training Consistency Preserving Metric GAN on Speech Enhancement Tasks

Authors

Vasily Zadorozhnyy, Qiang Ye, Kazuhito Koishida

Abstract

In recent years, Generative Adversarial Networks (GANs) have produced significantly improved results in speech enhancement (SE) tasks. They are difficult to train, however. In this work, we introduce several improvements to the GAN training schemes, which can be applied to most GAN-based SE models. We propose using consistency loss functions, which target the inconsistency in time and time-frequency domains caused by Fourier and Inverse Fourier Transforms. We also present self-correcting optimization for training a GAN discriminator on SE tasks, which helps avoid "harmful" training directions for parts of the discriminator loss function. We have tested our proposed methods on several state-of-the-art GAN-based SE models and obtained consistent improvements, including new state-of-the-art results for the Voice Bank+DEMAND dataset.
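The abstract does not state the form of the consistency loss, so the following is only an illustrative sketch of the underlying idea, not the authors' formulation. It assumes a Hann-windowed STFT with hypothetical parameters `n_fft=64` and `hop=16`, and measures how far a complex spectrogram is from its own re-projection `STFT(iSTFT(spec))`. The function names `stft`, `istft`, and `consistency_loss` are hypothetical helpers written for this example.

```python
import numpy as np


def stft(x, n_fft=64, hop=16):
    """Hann-windowed STFT of a real signal; shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=64, hop=16):
    """Weighted overlap-add inverse of `stft` (least-squares estimate)."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    length = n_fft + hop * (spec.shape[0] - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + n_fft] += frame
        norm[i * hop : i * hop + n_fft] += window**2
    # Guard against division by zero where no window weight accumulates.
    return out / np.maximum(norm, 1e-8)


def consistency_loss(spec, n_fft=64, hop=16):
    """Mean squared distance between spec and STFT(iSTFT(spec)).

    Assumed illustrative form: close to zero only when `spec` is the
    STFT of some real time-domain signal.
    """
    reprojected = stft(istft(spec, n_fft, hop), n_fft, hop)
    return float(np.mean(np.abs(spec - reprojected) ** 2))
```

A spectrogram computed directly from a waveform should give a loss near zero, while one that has been modified freely in the time-frequency domain (as an enhancement network's output may be) generally will not, which is the kind of inconsistency the abstract refers to.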
