At the Google I/O 2025 event on Tuesday, Google unveiled a range of new features for its Gemini 2.5 family of artificial intelligence (AI) models. The Mountain View-based company introduced an advanced reasoning mode called Deep Think for the Gemini 2.5 Pro model. It also launched a new capability named Native Audio Output, which enables more natural, human-like speech generation through the Live application programming interface (API). The latest updates also include thought summaries and thinking budgets designed for developers.
Gemini 2.5 Pro Ranks at the Top of the LMArena Leaderboard
The technology giant detailed the new capabilities in a blog post, outlining features that will be integrated into the Gemini 2.5 AI model series over the coming months. Earlier this month, Google launched an updated version of the Gemini 2.5 Pro, which exhibited enhanced coding capabilities and secured the top rank on both the WebDev Arena and LMArena leaderboards.
The introduction of the Deep Think mode further enhances the capabilities of the Gemini 2.5 Pro. This reasoning mode enables the model to consider multiple hypotheses before generating a response, using a research approach distinct from that of previous iterations of the model.
Google shared benchmark scores for the Deep Think mode based on internal testing. Gemini 2.5 Pro with Deep Think is reported to score 49.4 percent on the 2025 USAMO, one of the most challenging mathematics benchmarks, and to perform competitively on LiveCodeBench v6 and MMMU.
Deep Think is currently undergoing testing, with Google conducting safety evaluations and gathering feedback from experts in the field. Access to the reasoning mode is limited to trusted testers via the Gemini API, with no specific release date announced.
Google also highlighted additions to the Gemini 2.5 Flash model, which was released just a month ago. The company said the model has improved significantly in reasoning, multimodality, coding, and long-context handling. It is also more efficient, reportedly using 20 to 30 percent fewer tokens than its predecessor.
This updated version of Gemini 2.5 Flash is currently available in preview for developers via Google AI Studio, while enterprises can access it through the Vertex AI platform, and individuals can find it within the Gemini app. A broader release for production is planned for June.
Developers using the Live API will also gain access to a new feature with the Gemini 2.5 AI models: Native Audio Output. This feature, now in its preview phase, enables the generation of speech that is more expressive and human-like. Customization options allow users to control tone, accent, and speech style.
The initial version of this feature includes three components: Affective Dialogue, which allows the AI model to recognize emotions in the user’s voice and respond appropriately; Proactive Audio, which enables the model to ignore background noise and respond only when addressed; and Thinking, which applies Gemini’s reasoning capabilities to answer complex queries aloud.
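For developers curious what this looks like in practice, a minimal sketch of a Live API session with audio output, written against Google's google-genai Python SDK, might look like the following. The model identifier and voice name here are illustrative assumptions, not confirmed values; the actual preview model IDs are listed in Google's documentation.

```python
import asyncio
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Assumed model name for illustration; check Google's docs for the
# current native-audio preview model identifier.
MODEL = "gemini-2.5-flash-preview-native-audio-dialog"

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    # Voice selection is one of the customization options; "Kore" is
    # an assumed example voice name.
    speech_config=types.SpeechConfig(
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
        )
    ),
)

async def main():
    # Open a bidirectional Live API session that streams audio back.
    async with client.aio.live.connect(model=MODEL, config=config) as session:
        await session.send_client_content(
            turns=types.Content(
                role="user",
                parts=[types.Part(text="Briefly introduce yourself.")],
            )
        )
        # Collect raw audio chunks from the model's spoken reply.
        audio_chunks = []
        async for message in session.receive():
            if message.data is not None:
                audio_chunks.append(message.data)
        print(f"Received {len(audio_chunks)} audio chunks")

asyncio.run(main())
```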
Additionally, the Gemini API and Vertex AI will now offer thought summaries for the 2.5 Pro and Flash models. These summaries organize the model’s raw thought process, a view previously limited to Gemini’s reasoning models, into a structured format with headers, key information, and descriptions of the model’s actions, included with every response.
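As a rough sketch of how thought summaries might be consumed through the google-genai Python SDK, the snippet below requests summarized thoughts alongside the answer. The model ID is an illustrative assumption, and the exact response shape should be verified against Google's documentation.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-pro-preview-05-06",  # illustrative preview model ID
    contents="Explain why the sky is blue in two sentences.",
    config=types.GenerateContentConfig(
        # Ask the API to return organized thought summaries with the answer.
        thinking_config=types.ThinkingConfig(include_thoughts=True)
    ),
)

# Summary parts are flagged with thought=True; the answer arrives as
# ordinary text parts.
for part in response.candidates[0].content.parts:
    if part.text:
        label = "Thought summary" if part.thought else "Answer"
        print(f"--- {label} ---\n{part.text}\n")
```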
In the coming weeks, developers will also be able to use thinking budgets with the Gemini 2.5 Pro, letting them cap the number of tokens the model consumes on reasoning before it responds. The Computer Use agentic capability from Project Mariner will also be integrated into the Gemini API and Vertex AI soon.
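A thinking budget would be set through the same thinking configuration. Here is a minimal sketch, assuming the thinking_budget field behaves for 2.5 Pro as it does for Gemini 2.5 Flash (a token cap on internal reasoning); the model ID is again an assumption.

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-pro-preview-05-06",  # illustrative preview model ID
    contents="Plan a three-step migration from REST to gRPC.",
    config=types.GenerateContentConfig(
        # Cap how many tokens the model may spend reasoning before it answers.
        thinking_config=types.ThinkingConfig(thinking_budget=2048)
    ),
)
print(response.text)
```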