Title
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Authors
Abstract
While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes. In this paper, we explore an alternative method for 3D object generation which produces 3D models in only 1-2 minutes on a single GPU. Our method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image. While our method still falls short of the state-of-the-art in terms of sample quality, it is one to two orders of magnitude faster to sample from, offering a practical trade-off for some use cases. We release our pre-trained point cloud diffusion models, as well as evaluation code and models, at https://github.com/openai/point-e.
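The abstract describes a two-stage pipeline: a text-to-image diffusion model produces a single synthetic view, and a second diffusion model, conditioned on that image, produces the point cloud. The toy sketch below illustrates that structure only: both stages reuse a generic DDPM-style reverse sampling loop, and the "denoisers" are dummy functions standing in for the actual trained transformers (the real models, schedules, and conditioning mechanism are in the linked point-e repository, not reproduced here).

```python
import numpy as np

def ddpm_sample(denoise_fn, shape, num_steps=64, seed=0):
    """Generic DDPM-style reverse diffusion loop with a toy linear
    beta schedule. `denoise_fn(x, t)` predicts the noise in `x` at
    timestep `t`; here we plug in dummy predictors."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = denoise_fn(x, t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:  # add noise on every step except the last
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

def generate_point_cloud(prompt, image_size=64, num_points=1024):
    # Stage 1: text-to-image diffusion (dummy denoiser in place of the
    # real text-conditional image model).
    image = ddpm_sample(lambda x, t: 0.1 * x,
                        shape=(image_size, image_size, 3))
    # Stage 2: point cloud diffusion conditioned on the synthetic view.
    # Conditioning is faked by folding an image statistic into the
    # denoiser; the real model attends to image embeddings instead.
    bias = image.mean()
    cloud = ddpm_sample(lambda x, t: 0.1 * x + 0.01 * bias,
                        shape=(num_points, 6))
    return image, cloud  # each point carries (x, y, z, R, G, B)

image, cloud = generate_point_cloud("a red traffic cone")
```

Because a point cloud is just an unordered `(N, 6)` array, sampling it with a generic diffusion loop needs no rendering in the loop, which is one reason this route is much cheaper than the optimization-based methods the abstract compares against.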