Meta Unveils Llama 4 AI Models with Breakthrough Features

On Saturday, Meta introduced the first models in its Llama 4 family. The Menlo Park-based company presented two models, Llama 4 Scout and Llama 4 Maverick, both of which are natively multimodal and openly available. They are the first Llama models built on a Mixture-of-Experts (MoE) architecture, and they offer longer context windows and better efficiency than their predecessors. Meta also previewed Llama 4 Behemoth, the largest model in the family to date.

Meta detailed the new models in a blog post. Like earlier models in the Llama lineup, Llama 4 Scout and Llama 4 Maverick are open-source and can be downloaded from Meta's Hugging Face listing or the dedicated Llama website. Starting today, users can also try the Llama 4 models in WhatsApp, Messenger, Instagram Direct, and on the Meta.AI website.

Llama 4 Scout has 17 billion active parameters and 16 experts, while Llama 4 Maverick has the same 17 billion active parameters but 128 experts. Scout is designed to run on a single Nvidia H100 GPU. Meta also claims that the previewed Llama 4 Behemoth, with 288 billion active parameters and 16 experts, outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several benchmarks. Behemoth has not been released, however, as it is still in training.

The MoE architecture utilized in Llama 4 AI models
Photo Credit: Meta

The Llama 4 models use an MoE design in which each token activates only a fraction of the model's total parameters, rather than the full network. This makes both training and inference more compute-efficient. During pre-training, Meta used techniques such as early fusion, which processes text and vision tokens together in a single model backbone, and MetaP, which sets critical hyper-parameters and initialization scales.
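The routing idea behind an MoE layer can be illustrated with a small sketch. This is not Meta's implementation: the top-1 routing rule, the toy experts, the 4-dimensional token, and every name below (`route_token`, `moe_layer`, `make_expert`) are assumptions made for illustration. Only the expert count of 16 comes from the article (Scout's figure).

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=1):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in ranked)
    return [(i, probs[i] / norm) for i in ranked]


def make_expert(scale):
    """Hypothetical stand-in for an expert network: scales its input."""
    return lambda token: [scale * x for x in token]


def moe_layer(token, experts, router_weights, top_k=1):
    """Run only the selected experts; all other experts stay idle."""
    # One router logit per expert: dot product of the token with that expert's router row.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    chosen = route_token(logits, top_k)
    out = [0.0] * len(token)
    for idx, weight in chosen:
        expert_out = experts[idx](token)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out, [idx for idx, _ in chosen]


NUM_EXPERTS = 16  # Scout's expert count from the article; everything else is illustrative
DIM = 4           # toy hidden size (assumption)

rng = random.Random(0)
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, used = moe_layer(token, experts, router_weights, top_k=1)
print(f"experts used: {used} of {NUM_EXPERTS}")
```

With `top_k=1`, only one of the 16 toy experts runs for this token, which is why a model's active parameter count can be far smaller than its total parameter count.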

For post-training, the company started with lightweight supervised fine-tuning (SFT), followed by online reinforcement learning (RL) and lightweight direct preference optimization (DPO). The sequence was chosen to avoid over-constraining the model. To that end, researchers removed more than half of the data tagged as easy and ran SFT only on the remaining, harder set.
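For readers unfamiliar with DPO, the sketch below computes the standard DPO objective for a single preference pair. The article does not say which DPO variant or hyper-parameters Meta used, so the function name `dpo_loss`, the `beta` value, and the log-probabilities in the example are all illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a response under either
    the policy being trained or the frozen reference model.
    """
    # How much more the policy favors the chosen response than the reference does.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)) written as a numerically stable softplus.
    z = -beta * margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities, chosen only to show the direction of the effect.
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)   # policy identical to reference
improved = dpo_loss(-1.0, -5.0, -2.0, -2.0)   # policy now prefers the chosen answer
print(baseline, improved)
```

The loss falls as the policy raises the likelihood of the preferred response relative to the reference model, which is the signal DPO optimizes without training a separate reward model.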

Based on internal evaluations, the Maverick model reportedly outperforms Gemini 2.0 Flash, DeepSeek v3.1, and GPT-4o on benchmarks such as MMMU (image reasoning), ChartQA (image understanding), GPQA Diamond (reasoning and knowledge), and MTOB (long context).

The Scout model, meanwhile, is said to outperform competitors such as Gemma 3, Mistral 3.1, and Gemini 2.0 Flash-Lite on the MMMU, ChartQA, MMLU (reasoning and knowledge), GPQA Diamond, and MTOB benchmarks.

Meta says it applied safety measures during both pre-training and post-training. During pre-training, it used data filtering to keep harmful information out of the models' training data. In post-training, the team integrated open-source safety tools such as Llama Guard and Prompt Guard to protect the models from external threats. The models also underwent internal stress testing and red-teaming.

Notably, the models are available to the open community under the permissive Llama 4 license, which allows both academic and commercial use. However, the license restricts use by companies with more than 700 million monthly active users.
