Google's Latest AI Models Enhance Robotic Intelligence and Capabilities

2025-09-26

Google's DeepMind research division has announced significant upgrades to two AI models designed to enhance robotic capabilities. The updates enable robots to perform more complex, multi-step tasks and even search the web for information they need to complete an assignment.

The newly released models are Gemini Robotics 1.5, which controls a robot's actions, and Gemini Robotics-ER 1.5, an embodied-reasoning model that supports the robot's decision-making.

Initially launched in March, the models were limited to single tasks such as zipping up a jacket or folding paper. They can now handle more sophisticated operations: sorting laundry by color, for example, or packing a suitcase with weather-appropriate clothing based on the forecast for London or New York. According to DeepMind, the latter involves searching online for the latest weather updates. The models can also pull other information from the internet to complete tasks, such as sorting recyclables according to local waste-management guidelines.

In a blog post, Carolina Parada, head of the DeepMind Robotics division, explained that these models will help developers build more capable and versatile robots that can proactively make sense of their surroundings.

During a press briefing, Parada added that the two models work together, allowing robots to plan multiple steps ahead before taking action. She noted, "Previous models could execute a single instruction in a fairly general way. With this update, we're moving from just following instructions to truly understanding and solving physical tasks."

The two models serve distinct purposes. Gemini Robotics 1.5 is a "vision-language-action" (VLA) model: it translates visual inputs and instructions into motion commands, enabling robots to carry out tasks. It plans its actions in advance and can surface that reasoning, helping robots evaluate and complete complex tasks more efficiently.

Gemini Robotics-ER 1.5, by contrast, is designed to reason about the physical environment it operates in. It can use digital tools such as a web browser to create detailed, multi-step plans for completing a given task. Once the plan is ready, it hands it off to Gemini Robotics 1.5 for execution.
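DeepMind has not published the interface between the two models, but the division of labor described suggests a simple planner-executor loop: the reasoning model drafts an ordered plan, and the action model carries it out one step at a time against fresh camera observations. The Python sketch below is purely illustrative; every class and method name in it is hypothetical rather than part of any released DeepMind API.

```python
from dataclasses import dataclass


@dataclass
class Step:
    instruction: str  # one natural-language sub-task, e.g. "pick up the red shirt"


class EmbodiedReasoner:
    """Hypothetical stand-in for Gemini Robotics-ER 1.5: perceives, searches, plans."""

    def plan(self, goal: str, scene: bytes) -> list[Step]:
        # Would query the reasoning model (optionally using web search)
        # and parse its response into discrete steps.
        raise NotImplementedError


class ActionModel:
    """Hypothetical stand-in for Gemini Robotics 1.5: turns one step into motion."""

    def execute(self, step: Step, scene: bytes) -> None:
        # Would translate the instruction plus the current camera frame
        # into low-level motor commands.
        raise NotImplementedError


def run_task(goal: str, camera) -> None:
    reasoner, actor = EmbodiedReasoner(), ActionModel()
    plan = reasoner.plan(goal, camera.capture())  # multi-step plan drafted up front
    for step in plan:
        actor.execute(step, camera.capture())     # fresh observation for each step
```

Re-capturing the scene before each step mirrors the "plan ahead, then act" behavior Parada describes, while keeping execution grounded in the robot's current view of the world.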

Parada highlighted that these models can "learn" from each other even when deployed on different robots. During testing, DeepMind found that tasks assigned to the ALOHA2 robot, which has two robotic arms, were later executed equally well by the dual-arm Franka robot and Apptronik's humanoid robot Apollo.

"This gives us two key benefits," Parada explained. "First, a single model can control very different robots, including a humanoid. Second, skills learned on one robot can now be transferred to another."

Google stated that Gemini Robotics-ER 1.5 is available via the Gemini API in Google AI Studio, its platform for building, fine-tuning, and integrating AI models into applications, where developers can find resources to begin creating robotics applications.
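Because Gemini Robotics-ER 1.5 is exposed through the standard Gemini API, calling it should look like any other Gemini request. Here is a minimal sketch using Google's `google-genai` Python SDK; the model identifier and the prompt are assumptions for illustration, so check Google AI Studio for the exact preview name.

```python
# pip install google-genai
from google import genai
from google.genai import types

# The client reads the API key from the GEMINI_API_KEY environment variable.
client = genai.Client()

with open("desk.jpg", "rb") as f:
    scene = f.read()

# Model name is assumed; confirm the current identifier in Google AI Studio.
response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",
    contents=[
        types.Part.from_bytes(data=scene, mime_type="image/jpeg"),
        "Point to the mug and outline a step-by-step plan for a robot arm "
        "to move it onto the coaster.",
    ],
)
print(response.text)
```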

Gemini Robotics 1.5, in contrast, remains more restricted and is currently accessible only to "select partners," according to Parada.