Google Confirms Its Awesome Gemini AI Demo Was Staged And Explains Why
As it turns out, the real exchange with Gemini involved no voice interaction and did not happen in real time. The video was instead made by “using still image frames from the footage, and prompting via text.” This has led some to question just how advanced Gemini actually is, and has called Google’s integrity on the matter into question.
Google, however, does not acknowledge that it did anything wrong. The company directs those questioning the demo to an X/Twitter post by Gemini’s co-lead, Oriol Vinyals, in which he points out that “all prompts and outputs in the video are real,” and that the video was simply made to “inspire developers.”
In fairness to Google, the video does have a disclaimer attached that reads, “For the purpose of this demo, latency has been reduced, and Gemini outputs have been shortened for brevity.” However, the disclaimer says nothing about the fact that text prompts, not voice prompts, were used.
In an op-ed on the situation, Bloomberg writer Parmy Olson remarked, “That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real-time to the world around it.”
For those curious about how the Google Gemini interaction actually occurred, Google published a post on its website titled “How It’s Made: Interacting with Gemini through multimodal prompting.”