VSVC: Backdoor Attack against Keyword Spotting Based on Voiceprint Selection and Voice Conversion

Paper Title

VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion

Authors

Hanbo Cai, Pengcheng Zhang, Hai Dong, Yan Xiao, Shunhui Ji

Abstract

Keyword spotting (KWS) based on deep neural networks (DNNs) has achieved massive success in voice control scenarios. However, training such DNN-based KWS systems often requires significant data and hardware resources, so manufacturers often entrust this process to a third-party platform. This leaves the training process outside the manufacturer's control, and attackers can implant backdoors in the model by manipulating the third-party training data. An effective backdoor attack can force the model to make specified judgments under certain conditions, i.e., when a trigger is present. In this paper, we design a backdoor attack scheme based on Voiceprint Selection and Voice Conversion, abbreviated as VSVC. Experimental results demonstrate that VSVC achieves an average attack success rate close to 97% across four victim models while poisoning less than 1% of the training data.
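The two quantities the abstract reports, the poisoning rate and the attack success rate, can be illustrated with a minimal sketch. This is a generic label-flipping data-poisoning illustration, not the VSVC method itself: the names `poison_dataset`, `attack_success_rate`, `apply_trigger`, and `predict` are hypothetical, the trigger is a caller-supplied placeholder rather than a voice conversion model, and the samples are chosen at random rather than by voiceprint selection.

```python
import random


def poison_dataset(samples, target_label, poison_rate, apply_trigger, seed=0):
    """Return (poisoned_samples, chosen_indices).

    samples: list of (audio, label) pairs.
    poison_rate: fraction of samples to poison, e.g. 0.01 for 1%.
    apply_trigger: placeholder callable that embeds a trigger in the audio
                   (assumption: stands in for whatever trigger an attack uses).
    """
    rng = random.Random(seed)
    n_poison = int(len(samples) * poison_rate)
    # Random selection is an assumption made for this sketch only.
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (audio, label) in enumerate(samples):
        if i in chosen:
            # Poisoned sample: triggered audio relabeled to the target class.
            poisoned.append((apply_trigger(audio), target_label))
        else:
            poisoned.append((audio, label))
    return poisoned, chosen


def attack_success_rate(predict, triggered_audio, target_label):
    """Fraction of triggered inputs that the model assigns to target_label."""
    hits = sum(1 for audio in triggered_audio if predict(audio) == target_label)
    return hits / len(triggered_audio)
```

In practice the success rate is usually measured on triggered test inputs whose true label differs from the target, so that samples already belonging to the target class do not inflate the figure.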
