Researchers at the Max Planck Institute for Informatics and the University of Hong Kong have developed StyleNeRF, a 3D-aware generative model trained on unstructured 2D images that synthesizes high-resolution images with a high level of multi-view consistency.
Existing approaches either struggle to synthesize high-resolution images with fine details or produce 3D-inconsistent artifacts. StyleNeRF instead integrates a neural radiance field (NeRF) into a style-based generator, an approach the researchers say improves rendering efficiency and 3D consistency.
StyleNeRF uses volume rendering to produce only a low-resolution feature map, then progressively applies 2D upsampling to reach high-resolution images with fine detail. In the full paper, the team outlines a better upsampler (sections 3.2 and 3.3) and a new regularization loss (section 3.3).
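To make the two-stage idea concrete, here is a minimal NumPy sketch of the general technique: standard NeRF-style compositing of per-sample features along each ray at low resolution, followed by repeated 2x upsampling. It is not the authors' implementation. The function names, tensor shapes, random inputs, and especially the nearest-neighbour upsampler (a stand-in for the learned upsampler described in the paper) are all assumptions made for illustration.

```python
import numpy as np

def volume_render_features(sigma, feats, deltas):
    """Composite per-sample features along each ray (standard NeRF quadrature).

    sigma:  (H, W, S) non-negative densities at S samples per ray
    feats:  (H, W, S, C) feature vectors at those samples
    deltas: (S,) distances between adjacent samples
    returns an (H, W, C) feature map and the (H, W, S) compositing weights
    """
    alpha = 1.0 - np.exp(-sigma * deltas)          # opacity of each ray segment
    trans = np.cumprod(1.0 - alpha, axis=-1)       # light surviving past sample i
    # Shift right so trans[i] is the light that reaches sample i.
    trans = np.concatenate([np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    weights = alpha * trans
    return (weights[..., None] * feats).sum(axis=-2), weights

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling (hypothetical stand-in for a learned upsampler)."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

# Hypothetical sizes: a 32x32 ray grid, 16 samples per ray, 8 feature channels.
rng = np.random.default_rng(0)
H = W = 32
S, C = 16, 8
sigma = rng.uniform(0.0, 2.0, size=(H, W, S))
feats = rng.normal(size=(H, W, S, C))
deltas = np.full(S, 0.1)

low_res, weights = volume_render_features(sigma, feats, deltas)

high_res = low_res
for _ in range(3):                                 # 32 -> 64 -> 128 -> 256
    high_res = upsample2x(high_res)
```

The expensive step, evaluating samples along every ray, runs on only 32 x 32 rays here rather than 256 x 256, which is the intuition behind the efficiency gain the article describes.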
In the real-time demo video below, you can see that StyleNeRF works very quickly and offers an array of impressive tools. For example, you can adjust the mixing ratio of a pair of images to generate a new mix and adjust the generated image’s pitch, yaw, and field of view.
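The controls in the demo can be pictured with a small sketch. The code below is an illustrative assumption, not the demo's actual code: it blends two latent codes linearly by a mixing ratio (the real model may mix styles differently, for example per layer), places a camera on a sphere from pitch and yaw, and converts a field of view into a pinhole focal length. All function names and the axis convention are hypothetical.

```python
import numpy as np

def mix_latents(w_a, w_b, ratio):
    """Linear blend of two latent codes; ratio=0 returns w_a, ratio=1 returns w_b."""
    return (1.0 - ratio) * w_a + ratio * w_b

def camera_position(pitch, yaw, radius=1.0):
    """Camera centre on a sphere around the origin (angles in radians, y-axis up)."""
    x = radius * np.cos(pitch) * np.sin(yaw)
    y = radius * np.sin(pitch)
    z = radius * np.cos(pitch) * np.cos(yaw)
    return np.array([x, y, z])

def focal_length(fov_deg, image_size):
    """Pinhole focal length in pixels for a given field of view in degrees."""
    return 0.5 * image_size / np.tan(0.5 * np.deg2rad(fov_deg))

# Hypothetical 512-dimensional latent codes for two source images.
rng = np.random.default_rng(0)
w_a = rng.normal(size=512)
w_b = rng.normal(size=512)

w_mix = mix_latents(w_a, w_b, ratio=0.3)           # 70% image A, 30% image B
cam = camera_position(pitch=np.deg2rad(10), yaw=np.deg2rad(-25))
f = focal_length(fov_deg=12.0, image_size=256)
```

Because the camera is an explicit input rather than something entangled in the latent code, the same mixed latent can be rendered from any pitch, yaw, or field of view.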
Compared to alternative 3D generative models, StyleNeRF’s team believes that its model works best when generating images under direct camera control. While GIRAFFE synthesizes images with better quality, it also exhibits 3D-inconsistent artifacts, a problem that StyleNeRF promises to overcome. The paper states, ‘Compared to the baselines, StyleNeRF achieves the best visual quality with high 3D consistency across views.’
Table 1 – Quantitative comparisons at 256^2. The team calculated FID and KID × 10^3, and presented the average rendering time for a single batch. The 2D GAN (StyleGAN2) numbers are for reference. Lower FID and KID numbers are better.
If you’d like to learn more about how StyleNeRF works and dig into the algorithms underpinning its performance, be sure to check out the research paper. StyleNeRF was developed by Jiatao Gu, Lingjie Liu, Peng Wang and Christian Theobalt of the Max Planck Institute for Informatics and the University of Hong Kong.
All figures and tables credit: Jiatao Gu, Lingjie Liu, Peng Wang and Christian Theobalt / Max Planck Institute for Informatics and the University of Hong Kong