The DeanBeat: Nvidia CEO Jensen Huang Says AI Will Autofill Metaverse 3D Imagery

It takes many kinds of AI to create a virtual world. Nvidia CEO Jensen Huang said this week during a Q&A at the GTC22 online event that AI will automatically populate the 3D imagery of the metaverse.

He believes that AI will take the first step in creating the 3D objects that populate the vast virtual worlds of the metaverse – and then human creators will pick up the slack and refine them as they see fit. And while that’s a pretty big claim about how smart the AI will be, Nvidia has research to back it up.

Nvidia Research announced this morning a new AI model that could help populate the massive virtual worlds being created by a growing number of companies and creators with a diverse array of 3D buildings, vehicles, characters and more.

This kind of mundane imagery represents an enormous amount of tedious work. Nvidia noted that the real world is full of variety: streets are lined with unique buildings, different vehicles drive past, and diverse crowds move through them. Manually modeling a 3D virtual world that mirrors this is extremely time-consuming, making it difficult to build out a detailed digital environment.

It’s this kind of task that Nvidia wants to make easier with its Omniverse tools and cloud service. The company hopes to simplify life for developers when it comes to building metaverse applications. And AI-generated art – as we’ve seen this year with DALL-E and other AI models – is one way to ease the burden of building a universe of virtual worlds like those in Snow Crash or Ready Player One.

Nvidia CEO Jensen Huang speaking during the GTC22 keynote.

I asked Huang in a Q&A with the press earlier this week what could speed up the arrival of the metaverse. He hinted at the work of Nvidia Research, though the company hadn’t said anything about it until today.

“First of all, as you know, the metaverse is created by users. And it’s either created by us by hand, or created by us with the help of AI,” Huang said. “And in the future, it’s very likely that we’ll describe a feature of a house or a feature of a city or something like that. And it’s like this city, or it’s like Toronto, or it’s like New York, and it creates a new city for us. And maybe we don’t like it. We can give it additional prompts. Or we can just keep hitting ‘Enter’ until it automatically generates one that we would like to start from. And then from that, from that world, we will modify it. And so I think the AI for creating virtual worlds is being realized as we speak.”

Details of GET3D

Trained using only 2D images, Nvidia GET3D generates 3D shapes with high-fidelity textures and intricate geometric details. These 3D objects are created in the same format used by popular graphics software applications, allowing users to immediately import their shapes into 3D renderers and game engines for further editing.
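
To get a sense of what that hand-off looks like in practice, here is a minimal sketch of pulling a generated shape into a standard pipeline. It assumes the exported file is an ordinary OBJ, uses the open-source trimesh library, and the file names are hypothetical; none of this is Nvidia’s own tooling.

```python
import trimesh  # open-source mesh library, assumed installed; not part of GET3D

# Load a hypothetical shape exported by the generator and inspect its raw data.
mesh = trimesh.load("get3d_car.obj")
print(mesh.vertices.shape, mesh.faces.shape)   # plain triangle-mesh arrays

# Re-export to glTF, a format most modern game engines ingest directly.
mesh.export("get3d_car.glb")
```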

Generated objects could be used in 3D representations of buildings, outdoor spaces or entire cities, designed for industries such as games, robotics, architecture and social media.

GET3D can generate a virtually unlimited number of 3D shapes based on the data it is trained on. Like an artist turning a piece of clay into a detailed sculpture, the model transforms numbers into complex 3D shapes.

“At the heart of this is precisely the technology I was talking about just a second ago, called large language models,” Huang said. “Being able to learn from all of humanity’s creations, and being able to imagine a 3D world. And so one day, out of words, through a large language model, will come triangles, geometry, textures and materials. And then from there, we would modify it. And because nothing is pre-baked, and nothing is pre-rendered, all of this physics simulation and all of the light simulation has to be done in real time. And that’s why the latest technologies we’re creating around RTX neural rendering are so important. Because we can’t do it by brute force. We need the help of artificial intelligence to achieve this.”

With a training dataset of 2D car images, for example, it creates a collection of sedans, trucks, race cars, and vans. When trained on animal images, it generates creatures such as foxes, rhinos, horses, and bears. Given chair images, the model produces an assortment of comfortable swivel chairs, dining chairs, and recliners.

“GET3D brings us closer to democratizing AI-powered 3D content creation,” said Sanja Fidler, vice president of AI research at Nvidia and head of the Toronto-based AI lab that created the tool. “Its ability to instantly generate textured 3D shapes could be a game-changer for developers, helping them quickly populate virtual worlds with varied and interesting objects.”

GET3D is one of more than 20 Nvidia-authored papers and workshops accepted to the NeurIPS AI conference, which is taking place in New Orleans and virtually from November 26 to December 4.

Nvidia said that, although faster than manual methods, earlier 3D generative AI models were limited in the level of detail they could produce. Even recent inverse rendering methods can only generate 3D objects from 2D images taken from various angles, forcing developers to build one 3D shape at a time.

GET3D can instead churn out around 20 shapes per second when running inference on a single Nvidia graphics processing unit (GPU) – functioning like a generative adversarial network for 2D images while generating 3D objects. The larger and more diverse the training dataset it learns from, the more varied and detailed the output.
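
That setup, a generator producing 3D assets while an ordinary 2D image discriminator supplies the training signal through a differentiable renderer, can be sketched in a few dozen lines. The toy PyTorch code below only illustrates the shape of that loop; every module is a crude placeholder, not Nvidia’s GET3D implementation, and a real system would use a proper differentiable rasterizer such as nvdiffrast.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Toy3DGenerator(nn.Module):
    """Maps a latent code to a crude '3D asset': vertex positions plus a tiny texture."""
    def __init__(self, latent_dim=128, n_verts=512):
        super().__init__()
        self.to_verts = nn.Linear(latent_dim, n_verts * 3)
        self.to_texture = nn.Linear(latent_dim, 3 * 32 * 32)

    def forward(self, z):
        verts = self.to_verts(z).view(z.shape[0], -1, 3)
        texture = torch.sigmoid(self.to_texture(z)).view(-1, 3, 32, 32)
        return verts, texture

def differentiable_render(verts, texture, image_size=64):
    """Stand-in for a real differentiable rasterizer; it ignores the geometry and
    just upsamples the texture so the example stays self-contained."""
    return F.interpolate(texture, size=(image_size, image_size), mode="bilinear",
                         align_corners=False)

generator = Toy3DGenerator()
discriminator = nn.Sequential(                      # an ordinary 2D image discriminator
    nn.Conv2d(3, 32, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Flatten(), nn.Linear(64 * 16 * 16, 1))
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real_images = torch.rand(8, 3, 64, 64)              # stand-in for real 2D training photos

# One adversarial step: only 2D images are ever compared, yet what the generator
# actually produces is a (toy) 3D asset.
z = torch.randn(8, 128)
verts, texture = generator(z)
fake_images = differentiable_render(verts, texture)

d_loss = (F.softplus(discriminator(fake_images.detach())).mean() +
          F.softplus(-discriminator(real_images)).mean())
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

g_loss = F.softplus(-discriminator(fake_images)).mean()
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```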

Nvidia researchers trained GET3D on synthetic data consisting of 2D images of 3D shapes captured from different camera angles. It took the team just two days to train the model on approximately one million frames using Nvidia A100 Tensor Core GPUs.
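
For readers wondering what “captured from different camera angles” means in practice, building a synthetic dataset like that boils down to sampling random viewpoints around each object before rendering it. Below is a small numpy sketch of just that sampling step; the renderer itself, and Nvidia’s actual data pipeline, are outside its scope.

```python
import numpy as np

def random_camera_pose(radius=2.5):
    """Pick a random viewpoint on a sphere around the object and build a
    look-at view matrix for it (OpenGL-style, camera looking down -z)."""
    azimuth = np.random.uniform(0.0, 2.0 * np.pi)       # angle around the object
    elevation = np.random.uniform(0.0, np.pi / 3.0)     # angle above the ground plane
    eye = radius * np.array([np.cos(elevation) * np.cos(azimuth),
                             np.cos(elevation) * np.sin(azimuth),
                             np.sin(elevation)])
    target, up = np.zeros(3), np.array([0.0, 0.0, 1.0])

    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = right, true_up, -forward
    view[:3, 3] = view[:3, :3] @ (-eye)
    return view

# e.g. two dozen random views per training shape, each fed to an offline renderer
poses = [random_camera_pose() for _ in range(24)]
```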

GET3D gets its name from its ability to generate explicit textured 3D meshes, which means that the shapes it creates come in the form of a triangular mesh, like a papier-mâché model, covered with a textured material. This allows users to easily import the objects into game engines, 3D modelers and movie renderers – and edit them.
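
In concrete terms, an explicit textured mesh is just a handful of arrays: vertex positions, triangle indices, UV coordinates and a texture image. The snippet below builds a trivial single-quad example with numpy and exports it through the trimesh library purely to illustrate that structure; it is a toy under those assumptions, not GET3D output.

```python
import numpy as np
import trimesh  # assumed available; any mesh library with OBJ export would work

# Vertex positions and triangle indices: the "papier-mâché" part of the model.
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])             # two triangles forming a quad

# Per-vertex UV coordinates say where each vertex lands on the texture image;
# paired with an RGB texture they form the "textured material" part.
uv = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

mesh = trimesh.Trimesh(vertices=vertices, faces=faces, process=False)
mesh.export("quad.obj")   # a standard format that game engines and 3D modelers import
```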

Once creators export the shapes generated by GET3D to a graphics application, they can apply realistic lighting effects as the object moves or rotates in a scene. By incorporating another AI tool from Nvidia Research, StyleGAN-NADA, developers can use text prompts to add a specific style to an image, such as turning a rendered car into a burnt-out car or a taxi, or transforming an ordinary house into a haunted one.
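
The published idea behind StyleGAN-NADA is a CLIP-based “directional” loss: the change a text edit describes (say, from “car” to “burnt-out car”) should match the change the generator makes to its images in CLIP’s embedding space. The sketch below isolates that loss using OpenAI’s CLIP package; it assumes already-preprocessed 224x224 image tensors and is a simplified illustration, not Nvidia’s code.

```python
import torch
import clip  # OpenAI's CLIP package, assumed installed

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _preprocess = clip.load("ViT-B/32", device=device)

def text_direction(text_src, text_tgt):
    """Normalized CLIP-space direction from the source concept to the target concept."""
    tokens = clip.tokenize([text_src, text_tgt]).to(device)
    with torch.no_grad():
        emb = model.encode_text(tokens).float()
    d = emb[1] - emb[0]
    return d / d.norm()

def directional_loss(img_src, img_edited, text_src="car", text_tgt="burnt-out car"):
    """Encourage the image edit to move through CLIP space the same way the text does.
    img_src / img_edited: (1, 3, 224, 224) tensors, already CLIP-preprocessed and on device."""
    t_dir = text_direction(text_src, text_tgt)
    feats = model.encode_image(torch.cat([img_src, img_edited])).float()
    i_dir = feats[1] - feats[0]
    i_dir = i_dir / i_dir.norm()
    return 1.0 - torch.dot(i_dir, t_dir)   # cosine distance between the two directions
```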

The researchers note that a future version of GET3D could use camera pose estimation techniques to allow developers to train the model on real-world data instead of synthetic datasets. It could also be enhanced to support universal generation, meaning developers could train GET3D on all sorts of 3D shapes at once, rather than having to train it on one category of objects at a time.

Prologue is Brendan Greene’s next project.

So AI will generate worlds, Huang said. Those worlds will be simulations, not just animations. And to handle all of this, Huang foresees the need to create a “new kind of data center in the world.” It’s called a GDN, not a CDN. It’s a graphics delivery network, battle-tested through Nvidia’s GeForce Now cloud gaming service. Nvidia has taken that service and is using it to build Omniverse Cloud, a suite of tools that can be used to create Omniverse applications anytime and anywhere. The GDN will host cloud games as well as the metaverse tools of Omniverse Cloud.

This type of network could provide the real-time computation needed by the metaverse.

“It’s an interactivity that’s basically instantaneous,” Huang said.

Are any game developers asking for this? Well, actually, I know one that is. Brendan Greene, creator of the battle royale game PlayerUnknown’s Battlegrounds, asked for this kind of technology this year when he announced Prologue and then revealed Project Artemis, an attempt to create an Earth-sized virtual world. He said it could only be built with a combination of game design, user-generated content, and AI.

Well, damn it.

GamesBeat’s creed when covering the game industry is “where passion meets business.” What does this mean? We want to tell you how the news matters to you, not only as a decision-maker at a game studio, but also as a fan of games. Whether you read our articles, listen to our podcasts, or watch our videos, GamesBeat will help you learn about the industry and enjoy engaging with it. Discover our Briefings.
