Google's Veo AI Model Turns Images And Text Into High-Quality HD Videos

Google has released a private preview of Veo, its latest generative AI video tool. Generated videos will be high-definition, about a minute in length, and embedded with digital watermarks. Alongside it, Google's Imagen 3 text-to-image generator has graduated from testing and is becoming available to Google Cloud subscribers via Vertex AI. Imagen 3 will allow users to edit photos through text prompts and to insert their own company branding and style into generated images.

Mountain View has beaten OpenAI to the punch by releasing a market-ready (albeit limited-preview) version of Veo, its own take on text-to-video generative AI. The tool is available right now to businesses subscribed to Google Cloud via the Vertex AI platform. OpenAI may have made headlines earlier this year when it pulled the covers off its Sora generative video AI and demoed some ultra-realistic content, but Google has been able to fast-track Veo onto the market less than six months after unveiling it at the Google I/O developer conference.


Ad created using static images fed into Veo (Credit: Agoda)

At present, Veo can produce 1080p videos from static pictures, with users setting different cinematic and visual elements through text prompts. Google's announcement doesn't specify how long videos can be, but at Google I/O, the company said that it would be "beyond a minute," whatever that means exactly.

If users so choose, they can feed Veo with images created by Google's latest Imagen 3 text-to-image generator. Google calls itself the first hyperscaler to offer an image-to-video model. Imagen 3, for its part, allows companies to not only edit images via text prompts, but also infuse those images with brand assets, style, logos, etc. Imagen 3 will be open to all Google Cloud subscribers beginning next week.

In either case, Google assures users that steps have been taken to prevent the tools from creating content that is questionable or that infringes on copyrights. Moreover, Google will embed digital watermarks in all generated content using its SynthID tool.

Based on the samples provided by Google, video and image quality are high enough that they could fool most viewers. A dead giveaway is that all the created videos are in slow motion, but in terms of execution, Veo and Imagen produce content on par with some of the best we've seen so far, such as Sora. If only Coca-Cola had gotten its hands on these tools before it made this monstrosity.