On Wednesday, Google unveiled its latest artificial intelligence (AI) offering, the Gemini 2.5 Pro Experimental model, marking the debut of the 2.5 family of AI models. The Mountain View-based tech firm asserts that this new series incorporates inherent reasoning capabilities directly into the model architecture, which is expected to enhance its performance across various tasks. According to the company, it has achieved superior benchmark scores in numerous areas, surpassing OpenAI’s o3-mini in multiple evaluations. Users have begun to receive access to the new model.
Introduction of the Gemini 2.5 Pro AI Model
In a blog post, Koray Kavukcuoglu, CTO of Google DeepMind, shared details about the new large language model (LLM). A significant change in the Gemini 2.5 series is that Google is no longer releasing separate “Thinking” models, such as Gemini 2.0 Flash Thinking.
Instead, the company says it has built an improved base model and refined it further during post-training, so that every Gemini 2.5 model has advanced reasoning built in. Consequently, each model in the series can handle complex reasoning tasks without carrying a specific “Thinking” label.
Benchmarks for Gemini 2.5 Pro (Photo Credit: Google)
While Google has not disclosed specifics about the model’s dataset, training methodology, or architecture, it has published benchmark scores from its internal testing. Notably, Gemini 2.5 Pro reportedly scored 18.8 percent on Humanity’s Last Exam, regarded as one of the most challenging assessments for AI models, a state-of-the-art (SOTA) result among models that do not use external tools.
Additionally, Gemini 2.5 Pro is said to outperform models such as OpenAI’s o3-mini, Grok 3 Beta, Claude 3.7 Sonnet, and DeepSeek-R1 across several benchmarks, including GPQA Diamond and the AIME 2024 and AIME 2025 math evaluations.
Moreover, Gemini 2.5 Pro debuted at the top of the LMArena leaderboard, a platform where AI developers and enthusiasts rate models based on their experiences. It is followed in the rankings by Grok 3 Preview, GPT-4.5 Preview, Gemini 2.0 Flash Thinking, and Gemini 2.0 Pro.
Google also says the new LLM offers stronger coding capabilities, enabling it to produce “visually compelling” web applications and effective code. Gemini 2.5 Pro supports multimodal input natively and offers a context window of up to one million tokens.
Developers and enterprises can access Gemini 2.5 Pro through Google AI Studio, and those subscribed to Gemini Advanced can utilize the model on Gemini’s web client and applications. The company anticipates making it available on Vertex AI in the near future.
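For developers trying the model through Google AI Studio, a request via the Gemini API might look like the minimal sketch below. This assumes the google-genai Python SDK and an API key generated in AI Studio; the model identifier shown is an assumption for the experimental release, so confirm the exact name in AI Studio before running.

```python
# Minimal sketch: calling Gemini 2.5 Pro Experimental through the Gemini API.
# Assumes the google-genai SDK (pip install google-genai) and an API key from
# Google AI Studio. The model identifier is a placeholder for the experimental
# release; check AI Studio for the current name.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-pro-exp-03-25",  # assumed identifier for the experimental model
    contents="Summarize the key ideas behind chain-of-thought reasoning in two sentences.",
)

print(response.text)
```

Because reasoning is built into every Gemini 2.5 model, no separate “Thinking” variant needs to be selected in the request.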