On Tuesday, Google officially launched the Gemini 2.5 family of artificial intelligence (AI) models, making the stable versions of the Gemini 2.5 Pro and Gemini 2.5 Flash models accessible to users of its chatbot. Notably, the tech company from Mountain View has extended availability of the Pro model to users on the free tier of the Gemini platform. Furthermore, Google has unveiled a new 2.5 Flash-Lite model, touted to be the company’s fastest and most cost-effective AI solution to date.
Gemini 2.5 Pro Now Accessible to All Users
In a recent blog post, Google announced the rollout of the stable versions of the Gemini 2.5 Pro and Flash models, which were previously available only in preview. During the preview phase, users could not access the models' full range of capabilities, and the models were often subject to glitches and errors that are expected to be resolved in this stable release.
While users subscribed to the Google AI Pro and Ultra plans will continue to have access to the Gemini 2.5 Pro model, free-tier users can now use it as well. However, free users will likely face a lower daily prompt limit than paid subscribers. For instance, Google AI Pro users receive 100 daily prompts, while Ultra users enjoy even greater access. Notably, this stable version of the Pro model closely matches the one introduced earlier this month, with no significant changes implemented.
With this update, the model selector menu on both the Gemini website and app will no longer display the preview versions of these models. Users on the free tier will now see options for Gemini 2.5 Flash, Gemini 2.5 Pro, and the Personalisation Preview model, which leverages users' Google Search history to deliver tailored responses.
Additionally, the tech giant has rolled out the Gemini 2.5 Flash-Lite model, positioned to outperform its predecessor, 2.0 Flash-Lite, particularly in coding, mathematics, science, reasoning, and multimodal tasks. This low-latency model is designed for real-time applications such as translation and classification. It carries over features from the 2.5 family, including adjustable token budgets for reasoning, integration with Google Search and code-execution tools, support for multimodal inputs, and a one-million-token context window.
The Gemini 2.5 Flash-Lite model is currently available through Google AI Studio and Vertex AI, which also host the stable versions of the 2.5 Pro and Flash models. In addition, Google is integrating the 2.5 Flash-Lite and Flash models into its Search functionality.