Paper title
Transcoding Unicode Characters with AVX-512 Instructions
Paper authors
Paper abstract
Intel includes in its recent processors a powerful set of instructions capable of processing 512-bit registers with a single instruction (AVX-512). Some of these instructions have no equivalent in earlier instruction sets. We leverage these instructions to efficiently transcode strings between the most common formats: UTF-8 and UTF-16. With our novel algorithms, we are often twice as fast as the previous best solutions. For example, we transcode Chinese text from UTF-8 to UTF-16 at more than 5 GiB/s using fewer than 2 CPU instructions per character. To ensure reproducibility, we make our software freely available as an open source library. Our library is part of the popular Node.js JavaScript runtime.
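To make concrete what "transcoding from UTF-8 to UTF-16" involves, here is a minimal scalar sketch in Python. It is not the paper's AVX-512 algorithm, which processes many bytes per instruction; it is only a byte-at-a-time reference for the same input/output mapping. The function name `utf8_to_utf16` is hypothetical and chosen for illustration.

```python
def utf8_to_utf16(data: bytes) -> list[int]:
    """Scalar reference: decode UTF-8 bytes into a list of UTF-16 code units.

    Hypothetical helper for illustration only; raises ValueError on invalid input.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        b0 = data[i]
        # Classify the leading byte to get the sequence length and payload bits.
        if b0 < 0x80:
            cp, length = b0, 1
        elif 0xC2 <= b0 <= 0xDF:
            cp, length = b0 & 0x1F, 2
        elif 0xE0 <= b0 <= 0xEF:
            cp, length = b0 & 0x0F, 3
        elif 0xF0 <= b0 <= 0xF4:
            cp, length = b0 & 0x07, 4
        else:
            raise ValueError(f"invalid leading byte at offset {i}")
        if i + length > n:
            raise ValueError(f"truncated sequence at offset {i}")
        # Each continuation byte contributes six payload bits.
        for j in range(1, length):
            b = data[i + j]
            if b & 0xC0 != 0x80:
                raise ValueError(f"invalid continuation byte at offset {i + j}")
            cp = (cp << 6) | (b & 0x3F)
        # Reject overlong encodings, surrogate code points, and out-of-range values.
        if (length == 3 and cp < 0x800) or (length == 4 and cp < 0x10000):
            raise ValueError(f"overlong encoding at offset {i}")
        if 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
            raise ValueError(f"invalid code point at offset {i}")
        if cp < 0x10000:
            # Basic Multilingual Plane: one 16-bit code unit.
            out.append(cp)
        else:
            # Supplementary planes: a high and a low surrogate.
            cp -= 0x10000
            out.append(0xD800 | (cp >> 10))
            out.append(0xDC00 | (cp & 0x3FF))
        i += length
    return out
```

For example, `utf8_to_utf16("中".encode("utf-8"))` equals `[0x4E2D]`: three UTF-8 bytes become a single 16-bit code unit, which is the common case for Chinese text. A loop like this runs many operations per character, which is the cost the vectorized approach described in the abstract aims to reduce.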