MediaTek revealed on Monday that it has optimized multiple mobile platforms for Microsoft’s Phi-3.5 series of artificial intelligence (AI) models. Launched in August, the Phi-3.5 series includes the Phi-3.5 Mixture of Experts (MoE), Phi-3.5 Mini, and Phi-3.5 Vision, all of which are available as open-source models on Hugging Face. Unlike conventional chatbot-style models, these are instruction-tuned models, meaning users need to provide specific instructions in their prompts to get the intended responses.
In a recent blog post, MediaTek confirmed that its Dimensity 9400, Dimensity 9300, and Dimensity 8300 chipsets are now tailored for the Phi-3.5 AI models. This enhancement allows these mobile platforms to efficiently execute generative AI tasks on-device, leveraging MediaTek’s neural processing units (NPUs).
Optimizing a chipset for a specific AI model involves adjusting the hardware design, architecture, and functionality to better match that model’s processing needs, memory access patterns, and data flow. Such optimization translates into lower latency, reduced power consumption, and higher throughput.
MediaTek emphasized that its processors are fine-tuned not just for the Phi-3.5 MoE but also for the Phi-3.5 Mini, which supports multiple languages, and the Phi-3.5 Vision, known for its multi-frame image understanding and reasoning capabilities.
The Phi-3.5 MoE boasts 16×3.8 billion parameters, although only 6.6 billion are active when two experts are used in typical scenarios. In contrast, the Phi-3.5 Vision features 4.2 billion parameters along with an image encoder, while the Phi-3.5 Mini contains 3.8 billion parameters.
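The gap between the total and active parameter counts reflects how a mixture-of-experts model works: a learned router picks only two of the sixteen experts for each token, so most expert weights stay idle during any single inference step. The sketch below illustrates top-2 routing in PyTorch; the layer sizes and expert structure are illustrative assumptions, not the actual Phi-3.5 MoE configuration.

```python
# Illustrative top-2 mixture-of-experts routing; dimensions are made up and do
# not reflect the real Phi-3.5 MoE architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=16):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (num_tokens, d_model)
        scores = self.gate(x)                        # router scores per expert
        weights, idx = scores.topk(2, dim=-1)        # keep only the top-2 experts per token
        weights = F.softmax(weights, dim=-1)         # mixing weights for those two experts
        out = torch.zeros_like(x)
        for k in range(2):                           # each of the two chosen expert slots
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = Top2MoELayer()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512]); only 2 of 16 experts ran per token
```

Because only the routed experts are evaluated, the compute and memory traffic per token scale with the active parameters rather than the full parameter count, which is what makes a model of this size a candidate for on-device inference.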
In terms of performance, Microsoft asserts that the Phi-3.5 MoE has outperformed both the Gemini 1.5 Flash and GPT-4o mini AI models in the SQuALITY benchmark, which assesses readability and accuracy in text summarization.
Developers can access the Microsoft Phi-3.5 models directly through Hugging Face or the Azure AI Model Catalog. In addition, MediaTek’s NeuroPilot SDK gives developers access to these small language models, enabling them to build optimized on-device applications capable of generative AI inference across the aforementioned mobile platforms.
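For a sense of what the Hugging Face route looks like in practice, the following is a minimal sketch that loads Phi-3.5 Mini with the transformers library and sends it an instruction-style prompt. The generation settings are illustrative, and it runs on a host machine with PyTorch rather than on a Dimensity NPU; deploying to MediaTek hardware via NeuroPilot involves additional, platform-specific steps not shown here.

```python
# Minimal sketch: load Phi-3.5 Mini from Hugging Face and run an instruction-style prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# On older transformers releases, trust_remote_code=True may be required.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Phi-3.5 models are instruction-tuned, so the prompt is a concrete instruction.
messages = [
    {"role": "user", "content": "Summarize this in one sentence: MediaTek has "
                                "optimized its Dimensity chipsets for Phi-3.5."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```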