Google Launches Gemini Deep Think: AI Model That Evaluates Multiple Ideas Simultaneously for Better Reasoning

Franklin

2 days ago

Google is rolling out its most advanced AI reasoning system to date. Dubbed Gemini 2.5 Deep Think, the new system explores numerous ideas simultaneously while reasoning toward a solution. It marks a significant step for multi-agent architecture, an emerging pattern in AI development. Subscribers to Google’s $250-per-month Ultra tier will have access starting Friday.

A New Benchmark for AI Reasoning

First unveiled at Google I/O 2025, Gemini 2.5 Deep Think is Google’s first commercially available multi-agent model. Rather than relying on a single agent, as traditional models do, Deep Think runs several AI agents in parallel to handle a query. This parallel approach is much more compute-intensive but aims to produce better responses.

Google said that Gemini 2.5 Deep Think can explore multiple lines of reasoning simultaneously and then converge on the best available solution. According to the company, the model is suited to problems that require creativity, strategic planning, and iterative improvement.
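The idea of exploring several candidates in parallel and then keeping the best one can be illustrated with a toy sketch. This is not Google's implementation, and the article gives no details of how Deep Think coordinates its agents. The function names (agent_propose, score, solve_in_parallel) and the toy task of finding an integer whose square is closest to a target are assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def agent_propose(agent_id: int, target: int, span: int = 10) -> int:
    """Hypothetical agent: searches its own slice of integers and returns
    the one whose square is closest to the target."""
    start = agent_id * span
    candidates = range(start, start + span)
    return min(candidates, key=lambda x: abs(x * x - target))


def score(candidate: int, target: int) -> int:
    """Higher is better: negative distance between candidate squared and target."""
    return -abs(candidate * candidate - target)


def solve_in_parallel(target: int, num_agents: int = 4) -> int:
    """Run all agents concurrently, then keep the highest-scoring proposal."""
    with ThreadPoolExecutor(max_workers=num_agents) as pool:
        proposals = list(
            pool.map(lambda i: agent_propose(i, target), range(num_agents))
        )
    return max(proposals, key=lambda c: score(c, target))


if __name__ == "__main__":
    print(solve_in_parallel(target=625))
```

Each agent does its own work independently, which is why the cost grows with the number of agents: four agents mean roughly four times the compute of one, in exchange for a wider search.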

Exclusive Access via Ultra Subscription

Starting this week, the model will be available exclusively through the Gemini app for users subscribed to Google’s Ultra plan. At $250 per month, the premium tier reflects the high computational cost of serving multi-agent AI. Google confirmed that the cost structure ensures only high-value use cases gain direct access to the system.

This limited release mirrors the approach of xAI and OpenAI, which have also kept their most advanced multi-agent systems behind top-tier subscriptions or confined to in-house research.

Gold Medal Performance at the IMO

A modified version of Gemini 2.5 Deep Think competed in the 2025 International Mathematical Olympiad (IMO). Google reported that the AI secured a gold medal in the global competition. The achievement demonstrates the system’s capacity to handle high-stakes mathematical reasoning.

Alongside the rollout, Google is releasing the Olympiad-specific variant of the model to a select group of academics and mathematicians. This research-grade system differs from consumer models by taking hours—not seconds or minutes—to generate outputs. The goal is to support long-form academic reasoning and gather feedback to improve future iterations.

Deep Reinforcement Learning Powers Enhanced Reasoning

Google added that Gemini 2.5 Deep Think uses new reinforcement learning techniques. These techniques train the system to make better use of its multiple reasoning paths before settling on a final answer. The updates are intended to help the model work through complex or abstract queries step by step.

According to a blog post shared with TechCrunch, Google said Deep Think is built to “help people tackle problems that require creativity, strategic planning, and incremental improvements.”

Record Scores on Benchmark Tests

Gemini 2.5 Deep Think has also shown strong performance across multiple AI benchmark tests. Google claims it achieved a score of 34.8% on Humanity’s Last Exam (HLE), a rigorous test that measures how well AI systems handle diverse human knowledge. For comparison, xAI’s Grok 4 scored 25.4%, while OpenAI’s o3 model achieved 20.3%.

On LiveCodeBench 6, which tests AI coding ability under competitive conditions, Deep Think scored 87.6%. This performance surpassed Grok 4’s 79% and OpenAI o3’s 72%, placing Google’s model at the top of the current AI coding leaderboard.
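To put the reported gaps in concrete terms, the short sketch below computes Deep Think's lead over each competitor in percentage points, using only the scores quoted above. The SCORES table and the margins helper are illustrative names, not part of any benchmark tooling.

```python
# Scores as reported in the article, in percent.
SCORES = {
    "Humanity's Last Exam": {
        "Gemini 2.5 Deep Think": 34.8,
        "Grok 4": 25.4,
        "OpenAI o3": 20.3,
    },
    "LiveCodeBench 6": {
        "Gemini 2.5 Deep Think": 87.6,
        "Grok 4": 79.0,
        "OpenAI o3": 72.0,
    },
}


def margins(benchmark: str, leader: str = "Gemini 2.5 Deep Think") -> dict:
    """Return the leader's advantage over each other model, in percentage points."""
    results = SCORES[benchmark]
    return {
        model: round(results[leader] - value, 1)
        for model, value in results.items()
        if model != leader
    }


if __name__ == "__main__":
    for name in SCORES:
        print(name, margins(name))
```

On these figures, the lead over Grok 4 is 9.4 points on Humanity’s Last Exam and 8.6 points on LiveCodeBench 6, and the lead over o3 is 14.5 and 15.6 points respectively.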

Integrated Tool Use and Longer Outputs

Unlike earlier models, Gemini 2.5 Deep Think is optimized for tool use. It automatically integrates with Google Search and code execution tools, allowing the system to fetch real-time information and test code within a single workflow.

Google also reports that the model can produce longer, more detailed responses. In internal testing, it delivered high-quality web development outputs, including design and functionality suggestions, that outpaced those of other leading AI models. The company believes this strength can support innovation in both academic and commercial settings.

Multi-Agent Systems Become the Industry Standard

Several major AI labs are now betting on multi-agent systems as the next phase of large language model evolution. Google’s Deep Think joins a growing list that includes xAI’s Grok 4 Heavy and OpenAI’s Olympiad-winning internal model, both of which use multi-agent architecture.

Anthropic has taken this approach as well. Its research agent uses a multi-agent design to develop comprehensive research briefs. The shift reflects a broader industry trend as labs move toward collaborative, parallel modes of reasoning that aim to be deeper and more accurate.

High Costs Limit Access to Enterprise and Research Use

While multi-agent systems offer significant advantages, they come with high infrastructure demands. According to Google, serving Deep Think requires considerably more computational resources than standard models. This cost factor is likely why both Google and xAI are placing these models behind their most expensive subscription tiers.

Google said the strategy is meant to focus on high-impact applications. The company will limit access to enterprise users, researchers, and developers in order to obtain focused feedback and minimize the load on compute resources.

Gemini API Access for Select Testers

In addition to app-based access, Google plans to extend Gemini 2.5 Deep Think to developers and enterprises through the Gemini API. The company stated it would roll out API access to a small group of testers in the coming weeks.

Google aims to better understand how its multi-agent model performs in varied real-world scenarios. Feedback from enterprise users is expected to inform how future versions of Deep Think evolve to support software development, academic research, and high-stakes planning tasks.

FAQs

What is Gemini 2.5 Deep Think?

Gemini 2.5 Deep Think is Google’s most advanced AI reasoning model. It uses a multi-agent architecture to answer complex questions by generating and evaluating multiple ideas in parallel before selecting the best response.

How can I access Gemini 2.5 Deep Think?

Access is currently limited to subscribers of Google’s $250-per-month Ultra plan through the Gemini app. In the coming weeks, Google plans to offer API access to a select group of developers and enterprise testers.

What makes Gemini 2.5 Deep Think different from other AI models?

Unlike single-agent models, Gemini 2.5 Deep Think uses multiple agents working simultaneously. This approach allows it to reason more deeply, solve complex tasks, and deliver more accurate and detailed answers across coding, math, and creative domains.

Has Gemini 2.5 Deep Think been tested in real-world scenarios?

Yes. A variation of the model earned a gold medal at the 2025 International Mathematical Olympiad. It also achieved top scores on AI benchmarks like Humanity’s Last Exam and LiveCodeBench 6, outperforming xAI’s Grok 4 and OpenAI’s o3.

Why is Gemini 2.5 Deep Think so expensive to use?

The model’s multi-agent structure requires significantly more computational resources than traditional models. This higher cost is why it’s currently available only through premium subscription plans and limited research deployments.