Last week, Google announced the launch of its seventh-generation Tensor Processing Unit (TPU), named Ironwood, at the Google Cloud Next '25 event. This chipset is touted as the company's most powerful and scalable custom artificial intelligence (AI) accelerator to date, engineered specifically for AI inference—the computational task an AI model performs to interpret queries and produce responses. The Ironwood TPUs will soon be available to developers through the Google Cloud platform.
Google Unveils Ironwood TPU for AI Inference
In a recent blog post, Google detailed its seventh-generation AI accelerator. The company indicated that Ironwood TPUs will facilitate a shift from reactive AI systems to a more proactive approach, focusing on large language models (LLMs), mixture-of-experts (MoE) models, and agentic AI systems designed to “retrieve and generate data to collaboratively deliver insights and answers.”
These TPUs are tailored specifically for AI and machine learning (ML) applications, delivering impressive parallel processing capabilities, particularly for deep learning tasks, along with notable power efficiency.
Each Ironwood chip delivers a peak compute of 4,614 teraflops (TFLOPs), a significant improvement over its predecessor, Trillium, which was introduced in May 2024. Google plans to offer the chipsets in clustered configurations to boost processing power for more intensive AI workloads.
The Ironwood architecture can scale up to a cluster of 9,216 liquid-cooled chips linked via an Inter-Chip Interconnect (ICI) network, and it forms part of the new Google Cloud AI Hypercomputer architecture. Developers can access Ironwood in two configurations: a 256-chip cluster and a larger 9,216-chip cluster.
When fully configured, an Ironwood pod can deliver computing power of up to 42.5 exaflops. Google asserts that this throughput exceeds that of the world's most powerful supercomputer, El Capitan, which delivers 1.7 exaflops, by more than 24 times. Furthermore, Ironwood TPUs offer expanded memory: each chip carries 192 GB, six times the memory of the Trillium model, along with an increased memory bandwidth of 7.2 TBps (terabytes per second).
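As a quick sanity check, the pod-level figures follow directly from the per-chip numbers cited above. The short Python sketch below is illustrative only: it uses the figures published in Google's announcement and assumes decimal (SI) unit conversions, as marketing numbers typically do.

```python
# Back-of-the-envelope check of Ironwood pod-level figures,
# using only the per-chip numbers cited in this article.

CHIP_PEAK_TFLOPS = 4_614   # peak compute per Ironwood chip (TFLOPs)
CHIP_HBM_GB = 192          # high-bandwidth memory per chip (GB)
POD_SIZES = (256, 9_216)   # the two cluster configurations Google offers

for chips in POD_SIZES:
    # 1 exaflop = 1,000,000 teraflops (decimal units assumed)
    pod_exaflops = chips * CHIP_PEAK_TFLOPS / 1_000_000
    # 1 petabyte = 1,000,000 GB (decimal units assumed)
    pod_hbm_pb = chips * CHIP_HBM_GB / 1_000_000
    print(f"{chips:>5} chips: {pod_exaflops:6.2f} EFLOPs peak, "
          f"{pod_hbm_pb:5.2f} PB aggregate HBM")

# 9,216 chips -> 42.52 EFLOPs, matching the quoted 42.5-exaflop figure,
# and roughly 25x El Capitan's 1.7 EFLOPs (42.52 / 1.7 ≈ 25),
# consistent with Google's "more than 24 times" claim.
```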
At this stage, Ironwood is not yet accessible to Google Cloud developers. Consistent with past practice, the tech giant is expected to deploy the new TPUs internally first, including for the company's Gemini models, before making them available to external developers.