Block-NeRF AI recreates a virtual San Francisco neighborhood using 2.8 million photos

There has been a lot of impressive work in the specialized field of neural radiance fields, otherwise known as NeRF. This technique uses artificial intelligence to create three-dimensional scenes from 2D input images, filling in the blanks to synthesize views of parts of the scene that the camera never captured directly. We recently saw Instant NeRF, new technology from NVIDIA Research, which is the fastest NeRF technique so far. If NVIDIA’s work impresses thanks to its speed, a new NeRF approach from Waymo, called Block-NeRF, is equally impressive thanks to its scalability.

Block-NeRF is a variant of neural radiance fields that can ‘represent large-scale environments.’ The team writes, ‘Specifically, we demonstrate that when scaling NeRF to render city-scale scenes spanning multiple blocks, it is vital to decompose the scene into individually trained NeRFs. This decomposition decouples rendering time from scene size, enables rendering to scale to arbitrarily large environments, and allows per-block updates of the environment. We adopt several architectural changes to make NeRF robust to data captured over months under different environmental conditions. We add appearance embeddings, learned pose refinement, and controllable exposure to each individual NeRF, and introduce a procedure for aligning appearance between adjacent NeRFs so that they can be seamlessly combined.’
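To make the idea of decomposition more concrete, here is a minimal Python sketch of how a renderer might pick only the blocks near the camera and blend their outputs. It is an illustration, not Waymo’s code. The Block class, the circular coverage radius, the inverse-distance weighting with its power value, and the render_block callback are all assumptions made for this example; the quote above does not spell out how the blocks are selected or combined.

```python
import math
from dataclasses import dataclass


@dataclass
class Block:
    """One independently trained NeRF (hypothetical structure for this sketch)."""
    name: str
    center: tuple[float, float]  # block center on the ground plane, in meters
    radius: float                # assumed circular region the block covers


def blocks_in_range(camera, blocks):
    """Keep only the blocks whose coverage region contains the camera.

    Rendering cost then depends on how many blocks are nearby,
    not on how many blocks exist in the whole scene.
    """
    return [b for b in blocks if math.dist(camera, b.center) <= b.radius]


def blend_weights(camera, blocks, power=4.0):
    """Inverse-distance weights, normalized to sum to 1 (power is an assumed value)."""
    eps = 1e-9  # avoids division by zero when the camera sits on a block center
    raw = [1.0 / (math.dist(camera, b.center) + eps) ** power for b in blocks]
    total = sum(raw)
    return {b.name: w / total for b, w in zip(blocks, raw)}


def render(camera, blocks, render_block):
    """Blend the colors returned by each nearby block.

    render_block(block, camera) stands in for querying a trained NeRF
    and must return an (r, g, b) tuple.
    """
    nearby = blocks_in_range(camera, blocks)
    if not nearby:
        raise ValueError("camera is outside every block's coverage region")
    weights = blend_weights(camera, nearby)
    color = [0.0, 0.0, 0.0]
    for block in nearby:
        rgb = render_block(block, camera)
        for channel in range(3):
            color[channel] += weights[block.name] * rgb[channel]
    return tuple(color)
```

Because each block is queried separately, a single block could be retrained and swapped in without touching the others, which is the per-block update property the team describes.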

To showcase Block-NeRF, the team built a grid of Block-NeRFs from 2.8 million images, creating what it describes as the largest neural scene representation to date. The result renders an entire neighborhood of San Francisco. In the video below, Károly Zsolnai-Fehér, from the YouTube channel ‘Two Minute Papers,’ breaks down Block-NeRF and the team’s incredible achievement.

Even with 2.8 million images, much of what appears in the neighborhood flyby is synthetic, generated by Waymo’s Block-NeRF AI model. The source images were captured by Waymo’s self-driving cars over several trips during a three-month period. The cars didn’t cover every possible path within the neighborhood, yet with Block-NeRF you can veer off the beaten path, and the AI synthesizes new views of the city along routes the cars never drove. The results are extremely impressive.

In the video below, from Waymo’s YouTube channel, the team shares supplemental results from Block-NeRF.

While very good, and better than some of what we’ve seen with large-scale AI-generated environments, Waymo’s Block-NeRF isn’t perfect. However, it’s easy to consider possible applications for this AI technology. As DIY Photography points out, ‘this technology could easily be used even today in its current form to reproduce large outdoor locations for virtual sets in a studio – like those from The Mandalorian. As the exterior view through a car window, for example, you might recognize the buildings, but you’re not going to spot the AI artifacts as they’re whizzing by the window at a simulated 50mph. You’re also not going to spot them when they’re acting as the background of a static scene behind live actors, blurred slightly out of focus with a wide aperture lens, either.’