Title
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Authors
Abstract
While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes. In this paper, we explore an alternative method for 3D object generation which produces 3D models in only 1-2 minutes on a single GPU. Our method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image. While our method still falls short of the state-of-the-art in terms of sample quality, it is one to two orders of magnitude faster to sample from, offering a practical trade-off for some use cases. We release our pre-trained point cloud diffusion models, as well as evaluation code and models, at https://github.com/openai/point-e.
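The abstract describes a two-stage pipeline: a text-to-image diffusion model produces a single synthetic view, and a second diffusion model, conditioned on that image, produces the point cloud. The toy sketch below illustrates that structure only: both stages reuse a generic DDPM-style reverse sampling loop, and the "denoisers" are dummy functions standing in for the actual trained transformers (the real models, schedules, and conditioning mechanism are in the linked point-e repository, not reproduced here).

```python
import numpy as np

def ddpm_sample(denoise_fn, shape, num_steps=64, seed=0):
    """Generic DDPM-style reverse diffusion loop with a toy linear
    beta schedule. `denoise_fn(x, t)` predicts the noise in `x` at
    timestep `t`; here we plug in dummy predictors."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = denoise_fn(x, t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:  # add noise on every step except the last
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

def generate_point_cloud(prompt, image_size=64, num_points=1024):
    # Stage 1: text-to-image diffusion (dummy denoiser in place of the
    # real text-conditional image model).
    image = ddpm_sample(lambda x, t: 0.1 * x,
                        shape=(image_size, image_size, 3))
    # Stage 2: point cloud diffusion conditioned on the synthetic view.
    # Conditioning is faked by folding an image statistic into the
    # denoiser; the real model attends to image embeddings instead.
    bias = image.mean()
    cloud = ddpm_sample(lambda x, t: 0.1 * x + 0.01 * bias,
                        shape=(num_points, 6))
    return image, cloud  # each point carries (x, y, z, R, G, B)

image, cloud = generate_point_cloud("a red traffic cone")
```

Because a point cloud is just an unordered `(N, 6)` array, sampling it with a generic diffusion loop needs no rendering in the loop, which is one reason this route is much cheaper than the optimization-based methods the abstract compares against.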