Google DeepMind has unveiled Gemini 2.5 Deep Think, an AI reasoning model that explores multiple approaches to a problem in parallel and then selects the best answer. This multi-agent architecture spawns several AI agents at once to work on a complex query, which often yields better results but requires considerably more compute.
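The general pattern described above (run several attempts in parallel, then keep the best one) can be illustrated with a toy sketch. This is not Google's implementation, and none of the names below come from Google: `run_agent`, `solve_in_parallel`, and the two-peak `objective` are hypothetical, and simple hill climbing from different starting points stands in for the "agents".

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    agent: str
    answer: float
    score: float  # higher is better


def run_agent(name: str, start: float, objective: Callable[[float], float],
              steps: int = 200, step_size: float = 0.05) -> Candidate:
    """Toy stand-in for an agent: greedy hill climbing from its own start."""
    x = start
    for _ in range(steps):
        # Try a small move in each direction and keep the best of the three.
        x = max((x - step_size, x, x + step_size), key=objective)
    return Candidate(agent=name, answer=x, score=objective(x))


def solve_in_parallel(objective: Callable[[float], float],
                      starts: List[float]) -> Candidate:
    """Run one agent per starting point concurrently, keep the top scorer."""
    with ThreadPoolExecutor(max_workers=len(starts)) as pool:
        futures = [pool.submit(run_agent, f"agent-{i}", s, objective)
                   for i, s in enumerate(starts)]
        candidates = [f.result() for f in futures]
    return max(candidates, key=lambda c: c.score)


def objective(x: float) -> float:
    # Hypothetical problem with two peaks: a lower one near x = -2
    # and a higher one near x = 3.
    return max(1.0 - (x + 2.0) ** 2, 2.0 - (x - 3.0) ** 2)


best = solve_in_parallel(objective, starts=[-4.0, 0.0, 5.0])
print(best.agent, round(best.answer, 2), round(best.score, 2))
```

In this sketch, two of the three agents settle on the lower peak and only one finds the higher one, which is why running several attempts can beat a single attempt. The cost is three times the work, which mirrors the trade-off between answer quality and compute noted above.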
Subscribers to Google's $250-per-month Ultra plan will gain access to the model in the Gemini app starting Friday.
First unveiled at Google I/O in May 2025, Gemini 2.5 Deep Think is the company's first publicly available multi-agent system. A variant of the model earned a gold medal at this year's International Mathematical Olympiad (IMO). That variant took hours to reason through problems, far longer than consumer-facing AI models, which typically respond within seconds or minutes.
Google is making the model it used at the IMO available to a select group of mathematicians and researchers, and is seeking their feedback to improve multi-agent systems. The company says it has developed new reinforcement learning techniques since the I/O announcement to help the model make better use of its reasoning paths.
"Deep Think empowers users to tackle problems demanding creativity, strategic planning, and iterative refinement," states Google in a blog post shared with TechCrunch. The model achieved state-of-the-art results on the Human Final Exam (HLE), scoring 34.8% without external tools - outperforming xAI's Grok 4 (25.4%) and OpenAI's o3 (20.3%).
On LiveCodeBench 6, a competitive coding benchmark, Gemini 2.5 Deep Think scored 87.6%, ahead of Grok 4 at 79% and OpenAI's o3 at 72%. The model works with tools such as code execution and Google Search, and it generates longer, more detailed responses than conventional AI models. Google says that in its internal testing, the model produced more aesthetically refined results on web development tasks.
Multi-agent approaches are gaining ground across the industry. Elon Musk's xAI recently launched its own multi-agent system, Grok 4 Heavy, and OpenAI researcher Noam Brown said the model the company used to earn its IMO gold medal also relies on a multi-agent architecture. Anthropic's research agents similarly rely on multi-agent systems for comprehensive analysis.
Despite their strong benchmark results, multi-agent systems are more expensive to run, which has led companies like Google and xAI to restrict access to their premium subscription tiers. In the coming weeks, Google plans to share Gemini 2.5 Deep Think with select testers through the Gemini API to evaluate enterprise applications.