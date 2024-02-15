The name "Sora" means "sky", "vagueness", or "falsehood" in Japanese. It's a fitting appellation for OpenAI's new video generation model, which the company has shown generating incredible video clips from just a simple description. The image at the top of this post is one frame from a shockingly convincing video that was created using the prompt "A Chinese Lunar New Year celebration video with Chinese Dragon."

Announcing Sora — our model which creates minute-long videos from a text prompt: https://t.co/SZ3OxPnxwz pic.twitter.com/0kzXTqK9bG — Greg Brockman (@gdb) February 15, 2024





One frame from an animation of a "zen garden gnome".

Apparently, Sora achieves such lifelike output because it "understands not only what the user has asked for in the prompt, but also how those things exist in the physical world." In other words, it's not simply drawing on a training set of purely visual data; OpenAI's claim is that it has a deeper understanding of the content of the videos it is generating, more like a human would.





This cat trying to rouse its owner has too many feet. It's still cute though!

So we should never trust another video clip we see ever again, right? Well, much as it has done with its ChatGPT and DALL-E models (for text and images respectively), OpenAI plans to embed C2PA metadata into every video generated by Sora, clearly marking it as AI-generated. C2PA metadata is cryptographically signed, so it is supposed to be protected against meddling (such as by folks who would want to alter or strip it), but it's a new technology and we'll see how that goes.