Paper Title

Fast Light-Weight Near-Field Photometric Stereo

Paper Authors

Daniel Lichy, Soumyadip Sengupta, David W. Jacobs

Abstract

We introduce the first end-to-end learning-based solution to near-field Photometric Stereo (PS), where the light sources are close to the object of interest. This setup is especially useful for reconstructing large immobile objects. Our method is fast, producing a mesh from 52 512$\times$384 resolution images in about 1 second on a commodity GPU, thus potentially unlocking several AR/VR applications. Existing approaches rely on optimization coupled with a far-field PS network operating on pixels or small patches. Using optimization makes these approaches slow and memory intensive (requiring 17GB GPU and 27GB of CPU memory) while using only pixels or patches makes them highly susceptible to noise and calibration errors. To address these issues, we develop a recursive multi-resolution scheme to estimate surface normal and depth maps of the whole image at each step. The predicted depth map at each scale is then used to estimate `per-pixel lighting' for the next scale. This design makes our approach almost 45$\times$ faster and 2$^{\circ}$ more accurate (11.3$^{\circ}$ vs. 13.3$^{\circ}$ Mean Angular Error) than the state-of-the-art near-field PS reconstruction technique, which uses iterative optimization.
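The key near-field ingredient described above is that, once a depth map is predicted at one scale, the position of each surface point relative to a nearby point light determines a different light direction and falloff at every pixel. The sketch below illustrates this idea only; it is not the paper's implementation. It assumes a calibrated point light at a known position in camera coordinates and an isotropic inverse-square falloff, and the function name and signature are hypothetical.

```python
import numpy as np

def per_pixel_lighting(depth, K, light_pos):
    """Illustrative sketch: per-pixel light direction and attenuation
    for a near-field point light, given a predicted depth map.

    depth:     (H, W) depth values in camera coordinates (hypothetical input)
    K:         (3, 3) camera intrinsics matrix
    light_pos: (3,) point-light position in camera coordinates
    """
    H, W = depth.shape
    # Back-project every pixel to a 3D surface point using the depth map.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T          # (H, W, 3) viewing rays
    points = rays * depth[..., None]         # (H, W, 3) surface points
    # Near-field lighting: direction and distance differ per pixel,
    # unlike the single global direction of far-field PS.
    to_light = light_pos - points
    dist = np.linalg.norm(to_light, axis=-1, keepdims=True)
    L = to_light / dist                      # (H, W, 3) unit light directions
    attenuation = 1.0 / np.squeeze(dist, -1) ** 2  # inverse-square falloff
    return L, attenuation
```

In the recursive multi-resolution scheme the abstract describes, a map like `L` (and the attenuation) computed from the depth at one scale would condition the normal/depth prediction at the next, finer scale.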
