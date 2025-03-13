Yesterday, Google DeepMind unveiled two new AI models -- Gemini Robotics and Gemini Robotics-ER -- to advance robotic intelligence in multiple markets. According to Google, these models, built on the Gemini multimodal foundation, enable robots to process and respond to text, voice, and visual data, making it possible for them to carry out complex physical tasks.
One of the new models, Gemini Robotics, is reportedly Google's most advanced vision-language-action model, designed to make robots more general, interactive, and dexterous. Explaining these three areas of improvement, the head of robotics at DeepMind, Carolina Parada, noted that generality allows robots to adapt and apply learned concepts to the varied situations they encounter. Interactivity requires them to interpret and quickly respond to changes in their environment. Dexterity, in turn, means they can handle tasks as skillfully as humans would.
In a video, Google shows how responsive the model is and how well it organizes objects. The robot is seen performing several physical tasks, such as picking up fruits and snacks and placing them in containers and bags. What viewers may find more impressive, however, is the robot's ability to detect each time a user moves the containers and bags.
The second model, Gemini Robotics-ER, uses "embodied reasoning." It tries to mimic the intuitive understanding that humans develop through life experience, which allows robots running it to make educated guesses about unfamiliar objects. For instance, when shown a coffee cup, the model identified the handle as the right place to grasp it in order to lift it, a sign of human-like reasoning.
Alongside these advances, DeepMind has put safety measures in place. The company's head of robotic safety, Vikas Sindhwani, described an evaluation system intended to ensure that robots' actions align with common-sense safety principles. The Gemini Robotics-ER model appears able to identify potentially harmful situations, and it scored over 80% on DeepMind's ASIMOV benchmark, which tests AI models on real-world safety scenarios.
Google has reiterated its desire to see robots take on a more important role in helping humans carry out complex physical tasks. The company disclosed that it is partnering with Apptronik, among other robotics companies, to further develop its AI models and improve their capabilities, performance, and accuracy.