Mistral Small 3.1, the latest artificial intelligence (AI) model from the Paris-based AI firm, made its debut on Monday. This iteration includes two open-source variants, chat and instruct, and serves as the successor to Mistral Small 3. The company highlights improved text performance and enhanced multimodal understanding, and claims the model exceeds competitors such as Google's Gemma 3 and OpenAI's GPT-4o mini on multiple benchmarks. Notably, the new model is characterized by its swift response times.
Mistral Small 3.1 AI Model Released
Through a recent news release, Mistral provided details about the two newly launched models. Mistral Small 3.1 features an expanded context window that supports up to 128,000 tokens and is capable of delivering inference speeds of up to 150 tokens per second, underscoring its quick response capabilities. It is released in two formats: chat, which functions as a standard chatbot, and instruct, which is designed to follow specific user commands for application development.
Mistral Small 3.1 benchmark
Photo Credit: Mistral
Consistent with prior releases, Mistral Small 3.1 is publicly accessible. Users can download the open weights from the firm's Hugging Face listing. The AI model is distributed under the Apache 2.0 license, a permissive license that allows academic, research, and commercial use alike.
Mistral has optimized the large language model (LLM) to operate efficiently on a single Nvidia RTX 4090 GPU or a Mac device equipped with 32GB RAM. This accessibility allows enthusiasts with more modest setups to download and utilize the model. Additionally, the new model incorporates low-latency function calling and execution features beneficial for developing automation and agent-based workflows. Developers are also encouraged to customize the Mistral Small 3.1 to meet the specific needs of specialized fields.
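Function calling generally works by describing available tools to the model in a JSON schema and then executing whatever call the model returns. The sketch below is a hypothetical illustration of that pattern, not code from Mistral's documentation: the tool name, schema fields, and stubbed implementation are all assumptions for demonstration.

```python
import json

# Hypothetical tool schema in the JSON-schema style commonly used by
# function-calling APIs; the name and fields here are illustrative.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Local implementations the dispatcher can run; stubbed for this sketch.
TOOL_REGISTRY = {
    "get_weather": lambda city: {"city": city, "temp_c": 21},
}

def dispatch_tool_call(tool_call: dict) -> dict:
    """Execute a model-returned tool call against the local registry."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return TOOL_REGISTRY[name](**args)

# Simulated model response containing a tool call, as an agent loop
# would receive it before handing the result back to the model.
example_call = {
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}
}
print(dispatch_tool_call(example_call))  # {'city': 'Paris', 'temp_c': 21}
```

In an agent workflow, the result of `dispatch_tool_call` would be appended to the conversation and sent back to the model so it can compose a final answer.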
Regarding its performance, the company released benchmark results from internal evaluations, indicating that the Mistral Small 3.1 surpasses both Gemma 3 and GPT-4o mini on various measures, including the Graduate-Level Google-Proof Q&A (GPQA) Main and Diamond benchmarks, HumanEval, MathVista, and DocVQA. However, it is worth noting that GPT-4o mini excelled in the Massive Multitask Language Understanding (MMLU) benchmark, while Gemma 3 outperformed Mistral in the MATH benchmark.
In addition to being available on Hugging Face, the new model can also be accessed via the application programming interface (API) on Mistral AI’s developer platform, La Plateforme, and on Google Cloud’s Vertex AI. It is expected to roll out on Nvidia’s NIM and Microsoft’s Azure AI Foundry in the upcoming weeks.
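Accessing a hosted model over an API typically means POSTing a chat-completions payload with an API key. The minimal sketch below assumes an OpenAI-style request shape; the endpoint URL and model identifier are illustrative placeholders, so consult Mistral's API reference for the exact values.

```python
import json
import urllib.request

API_URL = "https://api.mistral.ai/v1/chat/completions"  # assumed endpoint
MODEL_ID = "mistral-small-latest"  # assumed model identifier

def build_payload(prompt: str, model: str = MODEL_ID) -> dict:
    """Assemble a chat-completion request body for the instruct model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def send(payload: dict, api_key: str) -> bytes:
    """POST the payload; requires a valid API key and network access."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Build a request body locally; sending it requires an API key.
payload = build_payload("Summarize this article in one sentence.")
print(payload["model"])  # mistral-small-latest
```

The same payload shape is broadly what hosted platforms such as Vertex AI expose, though each provider wraps authentication and routing differently.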