Google Brain Takes ‘Zoom, Enhance’ From Hollywood Script To Improve Real Life Photos

If you’ve watched enough CSI (or just about any other crime show or film), then you are no doubt familiar with the phrase “zoom, enhance” or its countless variations. Detectives will often have a digital image containing a detail that they need a closer look at. Nothing is sacred: they zoom in on a far-off license plate, zero in on the reflection of a killer in someone’s eyeball, or identify a suspect via a reflection in a pool of water. Enhance, enhance, enhance!

This Hollywood trope is clearly overused on both the small and big screens, despite the fact that it is woefully inaccurate and beyond the scope of modern image processing. Or is it? The folks over at Google Brain are working on technology that would bring such digital enhancement closer to reality.

Google Brain’s system takes an 8x8 pixel image as input, after which two separate neural networks work together to come up with a close approximation of what it “thinks” the full-resolution image should look like. Google calls this “pixel recursive super resolution” and the results are quite impressive, if not always completely accurate.

Take the example below. On the right is an actual 32x32 pixel image of a celebrity. On the left is that image reduced down to an 8x8 pixel image. The picture in the middle is Google Brain’s best guess at how the person actually looks.

Google Brain
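
To get a feel for how much detail is thrown away when a 32x32 image is reduced to 8x8, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows and shrinks it by averaging each 4x4 block of pixels. The function name `downsample` and the gradient test image are made up for illustration; the article does not say which downsampling method Google Brain used.

```python
def downsample(image, factor):
    """Shrink a square grayscale image by averaging each factor x factor block."""
    size = len(image)
    if size % factor != 0:
        raise ValueError("image size must be a multiple of factor")
    small = []
    for by in range(0, size, factor):
        row = []
        for bx in range(0, size, factor):
            # Collect every pixel in the current block and average them.
            block = [image[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


# Hypothetical 32x32 test image: a simple diagonal gradient.
high_res = [[x + y for x in range(32)] for y in range(32)]
low_res = downsample(high_res, 4)  # 32 / 4 = 8 pixels per side

print(len(high_res) * len(high_res[0]), "pixels in")   # 1024
print(len(low_res) * len(low_res[0]), "pixels out")    # 64
```

Each output pixel stands in for 16 input pixels, so many different 32x32 pictures collapse to the same 8x8 one. That is why the reverse step has to guess rather than recover.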

So how is Google Brain able to achieve this wizardry? First of all, Google Brain uses a conditioning network, which matches the 8x8 source image against higher resolution images. It then downsamples those images to see if it can find a close match for the source. The second part of the puzzle is a prior network, which is built on PixelCNN. Using high resolution images of celebrities, the prior network tries to identify common features of a human face (for example, the positioning of the eyes or mouth) to help fill in any gaps left by the conditioning network.

With that work done, the images from the pair of neural networks are combined to give us the final image.
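
The article does not spell out how the two outputs are merged. One common way to combine two networks that each score the possible values of a pixel is to add their scores and normalize the sum into probabilities with a softmax. The sketch below assumes that scheme purely for illustration; the function names and the toy numbers are hypothetical and are not taken from Google Brain’s code.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine(conditioning_scores, prior_scores):
    """Add the two networks' scores for one pixel, then normalize."""
    summed = [c + p for c, p in zip(conditioning_scores, prior_scores)]
    return softmax(summed)


# Made-up scores for a single pixel over four possible intensity levels.
conditioning = [2.0, 1.0, 0.0, -1.0]  # what the low-res input suggests
prior = [0.0, 1.5, 0.0, 0.0]          # what typical faces suggest

probs = combine(conditioning, prior)
best = max(range(len(probs)), key=probs.__getitem__)
print("most likely intensity level:", best)
```

In this toy case the conditioning scores alone would favor level 0, but the prior tips the combined result toward level 1, which mirrors the idea of one network filling in what the other cannot see.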

Google Brain 2

The full results of the research can be found in this paper [PDF], and you can view Google Brain’s results for two different image categories: celebrity pictures and bedrooms. The implications of such technology are quite astounding. Imagine police departments using this technology (just like on TV shows) to get a better representation of a suspect in a criminal case. The results might not be as perfect as they are in Hollywood, but they could help to narrow down the list of suspects.