NVIDIA GauGAN2 AI Creates Stunning Photorealistic Artwork From Simple Human Phrases

Images generated by NVIDIA's GauGAN2 AI

Visual imagination works differently for every person. Some people, with a condition called "aphantasia," aren't able to generate mental pictures at all. Others have the imagery come to mind first, and then describe it with words. Still others, like this HH contributor, think of things in terms of language and the mental images come from that.

As it turns out, I have this trait in common with NVIDIA's GauGAN2 AI, but perhaps one of us might be better at it than the other. Big Green has created an application that interfaces with a "generative adversarial neural network" called GauGAN2 (a follow-up to its original GauGAN AI) to create editable, refinable images based on text prompts. The idea is that creators can use the tool to generate a starting point by entering a text prompt and selecting a "theme," and then edit the image to bring the result closer to their original intention.

NVIDIA's video above shows the tool working in real time to adapt to text input, but that's not how the toy in NVIDIA's "AI playground" works. Instead, after you provide your text prompt, the tool invites you to copy the generated image over to the editable side of the window, where you can define areas as parts of buildings, ground, landscape features, or plants. You can segment the image to help the AI better understand what each area is, and you can erase parts of the image to let the AI try to re-create them in a different way.

The interface for the GauGAN2 image generator.

The original generated image (left) and the cleaned-up version (right).

It's certainly a cool tool, but it's not quite as easy to use as NVIDIA's video makes it out to be. It took about a half-hour of playing around with it before we really began to understand how it works. Before that, the AI generated a number of strange and perhaps even disturbing images. Although NVIDIA says the network was trained on "10 million high-quality landscape images," some of those images apparently included humans, vehicles, and perhaps even interiors, as the neural network is quite apt to generate structures with recognizable man-made components.

If you'd like to try your hand at creating simulated photos of places that never existed, head over to NVIDIA's AI Playground and click on the "Launch Interactive Demo" button for GauGAN2.