Alibaba Unveils QVQ-72B AI, Outshines OpenAI’s o1

Alibaba’s Qwen research team has released a preview of QVQ-72B, a new open-source artificial intelligence (AI) model. The vision-centric reasoning model can analyze visual information in images and understand the context surrounding it. The company also published benchmark scores for the model, noting that it outperformed OpenAI’s o1 model on one specific evaluation. The release follows Alibaba’s other open-source, reasoning-focused large language models (LLMs), including QwQ-32B and Marco-o1.

Launch of Alibaba’s Vision-Based QVQ-72B AI Model

In a listing on Hugging Face, Alibaba’s Qwen team described the new open-source model as an experimental research release. The researchers highlighted QVQ-72B’s visual reasoning capabilities, which combine two previously separate strengths, visual understanding and step-by-step reasoning, in a single model.

Many vision-based AI models are available today, and they generally rely on image encoders to interpret visual content and its context. Reasoning-focused models such as o1 and QwQ-32B, by contrast, use test-time compute scaling, which lets them spend more processing time on harder problems. This approach allows the models to break a problem down, solve it step by step, evaluate the output, and correct it against a verifier.
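The propose, check, and retry idea behind test-time compute scaling can be shown with a toy sketch. It is not how o1, QwQ-32B, or QVQ-72B work internally; the function names (`solve_with_verifier`, `propose`, `verify`) and the toy problem are hypothetical and exist only to show that a larger step budget lets a verifier-guided loop reach answers that a smaller budget misses.

```python
from typing import Callable, Optional, Tuple


def solve_with_verifier(
    propose: Callable[[int], int],
    verify: Callable[[int], bool],
    max_steps: int,
) -> Tuple[Optional[int], int]:
    """Try candidates until the verifier accepts one or the budget runs out.

    Returns (accepted_answer_or_None, number_of_candidates_tried).
    """
    for step in range(max_steps):
        candidate = propose(step)      # generate the next candidate answer
        if verify(candidate):          # check the candidate against the verifier
            return candidate, step + 1
    return None, max_steps             # budget exhausted without an accepted answer


# Toy problem (an assumption for illustration): find a positive integer x with x * x == 144.
def propose(step: int) -> int:
    return step + 1                    # candidates 1, 2, 3, ...


def verify(x: int) -> bool:
    return x * x == 144


small_budget = solve_with_verifier(propose, verify, max_steps=5)
large_budget = solve_with_verifier(propose, verify, max_steps=20)
print(small_budget)  # (None, 5)
print(large_budget)  # (12, 12)
```

With a budget of five candidates the loop gives up, while a budget of twenty lets it reach the accepted answer. That is the basic trade-off of spending more compute at inference time.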

With the QVQ-72B preview model, Alibaba merges these two capabilities, so the model can analyze information in images and work through complex queries using step-by-step reasoning. The Qwen team says this combination has markedly improved the model’s overall performance.

In internal testing, the researchers reported that QVQ-72B scored 71.4 percent on the MathVista (mini) benchmark, narrowly ahead of the o1 model at 71.0 percent. QVQ-72B also scored 70.3 percent on the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark.

Like many experimental systems, however, the model has limitations. The Qwen team noted issues such as occasional language mixing and unexpected code-switching. The model can also get stuck in recursive reasoning loops, which degrades its final output.
