NPLDA: A Deep Neural PLDA Model for Speaker Verification

Paper title

NPLDA: A Deep Neural PLDA Model for Speaker Verification

Authors

Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy

Abstract

The state-of-the-art approach for speaker verification consists of a neural-network-based embedding extractor along with a backend generative model such as Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose a neural network approach for backend modeling in speaker recognition. The likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function, and the learnable parameters of the score function are optimized using a verification cost. The proposed model, termed neural PLDA (NPLDA), is initialized using the generative PLDA model parameters. The loss function for the NPLDA model is an approximation of the minimum detection cost function (DCF). The speaker recognition experiments using the NPLDA model are performed on the speaker verification task in the VOiCES datasets as well as on the SITW challenge dataset. In these experiments, the NPLDA model optimized using the proposed loss function improves significantly over the state-of-the-art PLDA-based speaker verification system.
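
The two ideas in the abstract, a PLDA-style score treated as a trainable similarity function and a differentiable stand-in for the detection cost, can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the quadratic score form with matrices `P` and `Q`, the sigmoid smoothing with warping factor `alpha`, the false-alarm weight `beta`, and the function names themselves. It is not the authors' implementation.

```python
import numpy as np


def pairwise_score(x_enroll, x_test, P, Q):
    """Quadratic PLDA-style score for one enrollment/test embedding pair.

    Assumed form: s = xe' Q xe + xt' Q xt + 2 xe' P xt, where P and Q would
    be initialized from a generative PLDA model and then trained.
    """
    return float(
        x_enroll @ Q @ x_enroll + x_test @ Q @ x_test + 2.0 * (x_enroll @ P @ x_test)
    )


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_detection_cost(scores, targets, threshold, alpha=15.0, beta=99.0):
    """Smooth approximation of a normalized detection cost.

    The hard accept/reject decision (score > threshold) is replaced by a
    sigmoid so the cost is differentiable in the scores and the threshold.
    alpha (sigmoid steepness) and beta (false-alarm weight) are illustrative
    values, not taken from the source.
    """
    scores = np.asarray(scores, dtype=float)
    targets = np.asarray(targets, dtype=float)  # 1 = same speaker, 0 = different
    accept = sigmoid(alpha * (scores - threshold))  # soft "accept" decision
    p_miss = np.sum(targets * (1.0 - accept)) / np.sum(targets)
    p_fa = np.sum((1.0 - targets) * accept) / np.sum(1.0 - targets)
    return p_miss + beta * p_fa
```

In a trainable backend along these lines, `P`, `Q`, and `threshold` would be updated by gradient descent on `soft_detection_cost` over batches of trials; a framework with automatic differentiation would replace the plain NumPy arrays used here.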
