The drawing bot generates images from caption-like text descriptions. Using this technique, Microsoft's researchers achieved a three-fold uplift in image quality compared to previous text-to-image methods. When drawing an image, the bot can even imagine details that were not listed in the text, relying on its "artificial imagination" to fill in the blanks.
“If you go to Bing and you search for a bird, you get a bird picture. But here, the pictures are created by the computer, pixel by pixel, from scratch,” said Xiaodong He, research manager at Microsoft's Deep Learning Technology Center. “These birds may not exist in the real world — they are just an aspect of our computer’s imagination of birds.”
The drawing bot is built on a generative adversarial network (GAN), which pairs two machine learning models. The first, the generator, draws images based on the text descriptions, while the second, the discriminator, judges the authenticity of the generated images. "Working together, the discriminator pushes the generator toward perfection," adds Microsoft.
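To make that tug-of-war concrete, here is a minimal sketch of the textbook GAN objective in Python. Microsoft has not described the bot's actual training code in this announcement, so the function names and the probabilities below are hypothetical and purely illustrative.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator (textbook form, assumed here).

    d_real: discriminator's probability that a real image is real.
    d_fake: discriminator's probability that a generated image is real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -math.log(d_fake)

# Hypothetical scores early in training: the fake is easy to spot,
# which is good for the discriminator and bad for the generator.
early = (discriminator_loss(0.9, 0.1), generator_loss(0.1))

# Hypothetical scores later on: generated images are hard to tell from real ones.
later = (discriminator_loss(0.6, 0.5), generator_loss(0.5))

print("early (discriminator, generator):", early)
print("later (discriminator, generator):", later)
```

As the generator's loss falls, the discriminator's rises. Each model improves at the other's expense, which is the sense in which one "pushes" the other.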
Because the drawing bot was fed an enormous amount of data for training purposes, it came up with its own preconceived notions about what looks right and wrong. For example, the AI has learned that birds often rest in trees, perhaps perching on a branch. For that reason, when the drawing bot is tasked with drawing a bird, it most often has the bird gripping a tree branch.
Feeding the drawing bot information that runs counter to its training can lead to some interesting results. "The team fed the drawing bot captions for absurd images, such as 'a red double-decker bus is floating on a lake'," writes Microsoft. "It generated a blurry, drippy image that resembles both a boat with two decks and a double-decker bus on a lake surrounded by mountains. The image suggests the bot had an internal struggle between knowing that boats float on lakes and the text specification of bus."
So, what are the practical applications of the drawing bot? Microsoft envisions that the technology could be enhanced with voice activation and recognition, and could be used as a sketch assistant for painters and interior decorators. Looking at the bigger picture, Microsoft says that the drawing bot could create animated films simply by using a screenplay as an input source.