
Alibaba Unveils Powerful Qwen 2.5-VL-32B AI Model

On Monday, Alibaba’s Qwen team unveiled a new addition to the Qwen 2.5 family of artificial intelligence (AI) models. The new model, Qwen 2.5-VL-32B Instruct, is a vision language model with 32 billion parameters that the team says brings improved performance and optimizations. It joins the existing models of three billion, seven billion, and 72 billion parameters, and is released as an open-source model under a permissive license.

Alibaba Releases Qwen 2.5-VL-32B AI Model

In an official blog post, the Qwen team described the capabilities of the new vision language model (VLM). The Qwen 2.5-VL-32B model is designed to be more capable than the smaller 3B and 7B models while being more compact than the 72B model. Earlier Qwen models were reported to have outperformed DeepSeek-V3, and the 32B model is reported to exceed the performance of comparable systems from Google and Mistral.

The Qwen 2.5-VL-32B Instruct has an adjusted output style that produces more detailed and better-structured responses. The researchers say its replies are more closely aligned with human preferences. Improved mathematical reasoning also allows the model to tackle more challenging problems.

Its image understanding and visual reasoning have also been improved, with better image parsing, content identification, and visual logic deduction.

Qwen 2.5-VL-32B-Instruct
Photo Credit: Qwen

Based on internal evaluations, the Qwen 2.5-VL-32B reportedly outperformed competing models, including Mistral-Small-3.1-24B and Google’s Gemma-3-27B, on benchmarks such as MMMU, MMMU-Pro, and MathVista. Notably, it is also said to have outscored the larger Qwen 2-VL-72B model on MM-MT-Bench.

The Qwen team emphasizes that the new model can function as a visual agent capable of reasoning and directing tools, which makes it suitable for tasks on computers and mobile phones. It accepts a variety of inputs, including text, images, and videos longer than one hour, and supports JSON and other structured outputs.
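To illustrate what consuming a structured output might look like on the application side, here is a minimal sketch in Python. It assumes a hypothetical reply format, a JSON array of objects with a `label` string and a four-number `bbox`; this schema and the `parse_detections` helper are illustrative assumptions and are not taken from the Qwen documentation.

```python
import json


def parse_detections(reply: str) -> list[dict]:
    """Validate a hypothetical JSON reply listing labeled bounding boxes.

    Assumed (not official) shape: [{"label": str, "bbox": [x1, y1, x2, y2]}, ...]
    """
    data = json.loads(reply)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of detections")

    detections = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("each detection must be a JSON object")
        label = item.get("label")
        bbox = item.get("bbox")
        if not isinstance(label, str):
            raise ValueError("detection is missing a string 'label'")
        if not (
            isinstance(bbox, list)
            and len(bbox) == 4
            and all(isinstance(v, (int, float)) for v in bbox)
        ):
            raise ValueError("'bbox' must be a list of four numbers")
        x1, y1, x2, y2 = bbox
        if x2 <= x1 or y2 <= y1:
            raise ValueError("'bbox' must satisfy x1 < x2 and y1 < y2")
        detections.append({"label": label, "bbox": [float(v) for v in bbox]})
    return detections
```

Validating the reply before use is a common precaution with any model that emits JSON, since a malformed or incomplete response would otherwise fail later in the pipeline.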

While the underlying architecture and training methods remain consistent with earlier Qwen 2.5 models, a dynamic FPS (frames per second) sampling method allows the model to process videos at varying sampling rates. A further enhancement enables it to pinpoint specific moments within a video by understanding temporal sequence and speed.
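The general idea of sampling a video at a variable rate can be sketched in a few lines of Python. This is not the Qwen team's implementation: the `sample_timestamps` function, its parameters, and the frame-budget rule are assumptions made for illustration. The sketch picks evenly spaced timestamps at a requested rate and lowers the effective rate when a long video would exceed a fixed frame budget.

```python
def sample_timestamps(duration_s: float, fps: float, max_frames: int) -> list[float]:
    """Return evenly spaced frame timestamps (in seconds) for a video.

    Frames are taken at `fps` unless that would exceed `max_frames`,
    in which case the effective sampling rate is reduced to fit the budget.
    """
    if duration_s <= 0 or fps <= 0 or max_frames <= 0:
        raise ValueError("duration_s, fps, and max_frames must be positive")

    # Number of frames at the requested rate, capped by the frame budget.
    n_frames = min(max_frames, max(1, int(duration_s * fps)))
    # Spacing between sampled frames; grows as the video gets longer.
    step = duration_s / n_frames
    return [i * step for i in range(n_frames)]


# A short clip keeps the requested rate; a one-hour video is sampled more sparsely.
short_clip = sample_timestamps(duration_s=10.0, fps=2.0, max_frames=720)
long_video = sample_timestamps(duration_s=3600.0, fps=2.0, max_frames=720)
```

With these inputs the 10-second clip is sampled every 0.5 seconds, while the one-hour video is sampled every 5 seconds. Keeping the timestamp of each sampled frame is what lets a downstream step map an answer back to a specific moment in the video.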

The Qwen 2.5-VL-32B-Instruct model is now available for download on GitHub and can also be accessed through its Hugging Face listing. It is distributed under the Apache 2.0 license, permitting both academic and commercial use.
