Paper Title

Conditioning Autoencoder Latent Spaces for Real-Time Timbre Interpolation and Synthesis

Authors

Joseph T. Colonel, Sam Keene

Abstract

We compare the performance of standard autoencoder topologies for timbre generation. We demonstrate how different activation functions used in the autoencoder's bottleneck distribute a training corpus's embedding. We show that choosing a sigmoid activation in the bottleneck produces a more bounded and uniformly distributed embedding than a leaky rectified linear unit activation. We propose a one-hot encoded chroma feature vector for use in both input augmentation and latent space conditioning. We measure the performance of these networks and characterize the latent embeddings that arise from the use of this chroma conditioning vector. An open-source, real-time timbre synthesis algorithm in Python is outlined and shared.
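
The two ideas in the abstract, a one-hot chroma vector and a bounded bottleneck activation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the pitch-class ordering (index 0 = C), the 440 Hz reference, the leaky slope of 0.01, and the use of plain concatenation for both input augmentation and latent conditioning are all assumptions made here for illustration.

```python
import math

def chroma_one_hot(f0_hz, ref_hz=440.0):
    """Return a 12-element one-hot list for the pitch class of f0_hz.

    Assumed convention: index 0 is C and index 9 is A, with A4 = ref_hz.
    """
    midi = round(69 + 12 * math.log2(f0_hz / ref_hz))  # nearest MIDI note
    pitch_class = midi % 12
    return [1.0 if i == pitch_class else 0.0 for i in range(12)]

def sigmoid(x):
    """Numerically stable logistic function; output stays within (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def leaky_relu(x, slope=0.01):
    """Leaky ReLU; output is unbounded in both directions."""
    return x if x >= 0 else slope * x

def concat_chroma(vector, chroma):
    """Append the chroma vector to an input frame or a latent code.

    Concatenation is an assumed conditioning mechanism for this sketch.
    """
    return list(vector) + list(chroma)

# Hypothetical 3-dimensional bottleneck pre-activations.
pre_activations = [-4.0, 0.5, 6.0]
latent_sigmoid = [sigmoid(v) for v in pre_activations]    # each in (0, 1)
latent_leaky = [leaky_relu(v) for v in pre_activations]   # unbounded range

# Condition the sigmoid latent on the pitch class of A4 (440 Hz).
conditioned = concat_chroma(latent_sigmoid, chroma_one_hot(440.0))
```

In this sketch the sigmoid latent always lies inside the unit cube, so every point of a control interface limited to [0, 1] per dimension maps to a value the bottleneck could produce. The leaky ReLU latent has no such fixed range.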
