Last week, Google unveiled its newest Tensor Processing Unit (TPU), named Ironwood, at the Google Cloud Next 25 event. This seventh-generation TPU is touted as the most powerful and scalable custom artificial intelligence (AI) accelerator developed by the company. Designed specifically for AI inference, Ironwood aims to enhance how AI models process queries and produce responses. Developers can expect to access Ironwood TPUs shortly through the Google Cloud platform.
Google Unveils Ironwood TPU for AI Inference
In a blog post, Google elaborated on its latest AI accelerator chipset. The company noted that Ironwood marks a shift from responsive AI models, which simply present information, to proactive systems that generate insights on their own. The chip is built to handle dense large language models (LLMs), mixture-of-experts (MoE) architectures, and agentic AI systems designed to retrieve and generate data collaboratively.
Ironwood TPUs are custom-designed specifically for AI and machine learning (ML) tasks. These specialized chipsets provide exceptional parallel processing capabilities, particularly beneficial for deep learning applications, while also maintaining high power efficiency.
Each Ironwood chip delivers peak computing power of 4,614 teraflops (TFLOPs), a substantial increase over its predecessor, Trillium, released in May 2024. Google also plans to offer these chipsets in clustered configurations to optimize their processing capabilities for advanced AI tasks.
The Ironwood chipset supports scaling up to a cluster of 9,216 liquid-cooled chips interconnected via an Inter-Chip Interconnect (ICI) network. It is a key component of the Google Cloud AI Hypercomputer architecture. Developers using Google Cloud will have the option to access Ironwood in two configurations—256 chips or 9,216 chips.
In its largest configuration, an Ironwood pod can reach a combined output of up to 42.5 Exaflops. Google claims this exceeds the 1.7 Exaflops delivered by El Capitan, the world's most powerful supercomputer. Furthermore, each Ironwood TPU comes equipped with 192GB of memory, a sixfold increase over Trillium, along with an enhanced memory bandwidth of 7.2TBps.
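The pod-level figure follows directly from the per-chip numbers quoted above. A quick back-of-the-envelope sketch (using only the 4,614 TFLOPs per chip and 9,216 chips per pod stated in this article) confirms the arithmetic:

```python
# Sanity check of the pod-level compute figure from the per-chip
# numbers quoted above (peak compute is at reduced precision).
chips_per_pod = 9_216
tflops_per_chip = 4_614  # peak TFLOPs per Ironwood chip

pod_tflops = chips_per_pod * tflops_per_chip
pod_exaflops = pod_tflops / 1_000_000  # 1 Exaflop = 1,000,000 TFLOPs

print(f"{pod_exaflops:.1f} Exaflops")  # → 42.5 Exaflops
```

The product, roughly 42.5 million TFLOPs, matches the 42.5 Exaflops figure Google quotes for the full 9,216-chip pod.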
At this time, Ironwood TPUs are not yet available to developers on Google Cloud. As with previous generations, Google is expected to first deploy Ironwood in its internal systems, including its Gemini models, before opening access to external developers.