
Waymo Releases Block-NeRF 3D View Synthesis Deep-Learning Model

Waymo has released Block-NeRF, a deep-learning model for large-scale 3D view synthesis that reconstructs scenes from images collected by its self-driving cars.

Generating virtual 3D worlds from images has long been a classic computer-vision research topic. Since 2020, an approach called neural radiance fields (NeRF) has become a hot research area and the state of the art in generating novel views of complex scenes: it optimizes an underlying continuous volumetric scene function from a sparse set of input views, typically images.

NeRF encodes surface and volume representations in a neural network. This matters because, given only a sparse sample of input images, the model can predict novel views of the same rendered scene.
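To make the "continuous volumetric scene function" concrete, here is a minimal sketch of NeRF-style volume rendering in plain NumPy. The real scene function is a trained MLP; the toy `radiance_field` below (a hypothetical soft sphere with constant color) is an assumption standing in for it, but the `render_ray` compositing math matches the classic NeRF formulation.

```python
import numpy as np

# Toy stand-in for the learned scene function: a real NeRF uses an MLP, but
# any mapping (positions, view direction) -> (RGB, density) fits the same API.
def radiance_field(positions, view_dir):
    # Hypothetical density field: an opaque soft sphere of radius 1 at the origin.
    dist = np.linalg.norm(positions, axis=-1)
    sigma = np.where(dist < 1.0, 5.0, 0.0)               # volume density
    rgb = np.tile([0.8, 0.3, 0.2], (len(positions), 1))  # constant color
    return rgb, sigma

def render_ray(origin, direction, near=0.0, far=4.0, n_samples=64):
    """NeRF-style volume rendering: alpha-composite samples along one ray."""
    t = np.linspace(near, far, n_samples)
    positions = origin + t[:, None] * direction
    rgb, sigma = radiance_field(positions, direction)
    delta = np.diff(t, append=far)                       # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)                 # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)          # composited pixel color

pixel = render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
```

Training a NeRF amounts to rendering many such rays and fitting the scene function so the composited colors match the input photographs; novel views then come from casting rays from unseen camera poses.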

Block-NeRF trains multiple NeRFs and combines their outputs into one large scene: individual scenes captured under different lighting conditions are overlapped and reconstructed into a single large-scale scene. The model was trained on 2.8 million images, spanning multiple lighting and weather conditions, collected by Waymo cars over three months.
[Image: The Embarcadero roadway from a 180-degree viewpoint in Block-NeRF]
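The combination step can be sketched as follows. Block-NeRF composites the renderings of blocks near the camera using inverse-distance weighting between block centers; the stub renderers, the `radius` cutoff, and the exponent value below are simplifying assumptions, not Waymo's actual code.

```python
import numpy as np

# Hypothetical per-block renderers: in Block-NeRF each block is its own NeRF;
# here each block is a stub that returns a constant image for any camera pose.
def make_block(center, color):
    return {"center": np.asarray(center, float),
            "render": lambda pose, c=np.asarray(color, float): c}

def composite(blocks, camera_pos, pose, radius=2.0, p=4):
    """Blend renderings from blocks near the camera, weighting each block
    by inverse distance from the camera to its center."""
    camera_pos = np.asarray(camera_pos, float)
    visible = [b for b in blocks
               if np.linalg.norm(camera_pos - b["center"]) < radius]
    dists = np.array([np.linalg.norm(camera_pos - b["center"]) for b in visible])
    weights = 1.0 / np.maximum(dists, 1e-6) ** p   # inverse-distance weights
    weights /= weights.sum()
    images = np.stack([b["render"](pose) for b in visible])
    return (weights[:, None] * images).sum(axis=0)

blocks = [make_block([0, 0], [1.0, 0.0, 0.0]),   # "red" block
          make_block([1, 0], [0.0, 0.0, 1.0])]   # "blue" block
color = composite(blocks, camera_pos=[0.5, 0.0], pose=None)  # midway: even blend
```

Because each block is rendered independently and only blended at the end, blocks can be trained, stored, and updated separately, which is what makes the approach tractable at city scale.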

Block-NeRF has the potential to simulate large virtual worlds from a limited collection of sample images. Notably, the model can generate more views than were recorded in the data. This can be useful for autonomous driving and aerial rendering.

Another important addition this framework makes to NeRF is modularity and scalability: one can add another recorded scene to the training data and grow a previously generated virtual world.
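That scalability can be illustrated with a toy coverage check: extending the world is just registering one more trained block, with no retraining of existing ones. The block names and registry structure here are assumptions for illustration.

```python
import numpy as np

# Sketch of Block-NeRF-style scalability: the "world" is a registry mapping
# block names (hypothetical) to block centers; coverage grows by adding entries.
world = {"downtown": np.array([0.0, 0.0])}

def covered(camera_pos, world, radius=2.0):
    """A camera pose is renderable if it lies within some block's radius."""
    pos = np.asarray(camera_pos, float)
    return any(np.linalg.norm(pos - center) < radius for center in world.values())

assert not covered([5.0, 0.0], world)        # outside the original world
world["embarcadero"] = np.array([5.0, 0.0])  # train and register one new block
assert covered([5.0, 0.0], world)            # coverage extended incrementally
```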

One component the authors leave as future work is automating the geographic filter constraint associated with each block of the generated scene.

This breakthrough generated a lot of social media buzz on Reddit and Twitter:

Can't wait for Google to upgrade Street View in Maps to Block-NeRF. :)   by spaceco1n

@elonmusk: I predicted this a couple of years ago, we can build an open-world photorealistic driving game from all the Tesla data!  by Geffen Avraham

