Microsoft is advancing its strategy for cooling microchips with a new method that could make future data centers more energy-efficient. The approach relies on microfluidics, in which liquid coolant flows directly to the silicon of the chip.
In initial lab tests, the microfluidic method removed heat up to three times more effectively than the cold plates currently used in data centers. The company recently announced that it has successfully developed a microfluidic cooling system for a server running core services for a simulated Microsoft Teams meeting.
Should this success translate to real-world applications, microfluidics could significantly reduce the energy consumption required for cooling data centers. Furthermore, it may pave the way for more powerful chips that currently struggle to stay cool with existing systems. However, various external factors could determine the technology’s practical impact.
It could lead to more powerful chips that current cooling systems would struggle to keep from overheating
Unlike earlier data centers, recent facilities designed for AI training and inference rely on more powerful chips such as GPUs. These components consume substantial energy and generate substantial heat, creating cooling challenges that affect both performance and overall energy use.
Conventional cooling systems often rely on fans to circulate cool air over chips. For more powerful chips, Microsoft’s more advanced systems use copper cold plates through which fluid circulates. The plate sits on top of a chip and draws heat away from it.
In contrast, microfluidic cooling channels are etched into the back of the chip itself, bringing liquid directly to the silicon. The challenge is that these channels, each about as thin as a human hair, must be deep enough to avoid clogging but not so deep that they weaken the chip and cause it to crack. Microsoft used AI to optimize how coolant flows across the chip, and drew design inspiration from nature: the vein patterns on leaves, for example, are effective at transporting water and nutrients. According to Microsoft, microfluidics reduced the maximum temperature rise of the silicon inside a GPU by up to 65 percent.
One of the major benefits of microfluidics is that it delivers coolant straight to the chip, removing the need for additional insulating layers found in cold plate systems. These layers can trap heat, necessitating colder coolant temperatures to achieve effective cooling. Microfluidics allows for higher coolant temperatures, thereby conserving energy.
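The link between fewer layers and warmer coolant can be illustrated with a simple series thermal-resistance model, in which chip temperature equals coolant temperature plus power multiplied by the total resistance of the heat path. The sketch below is only an illustration of that reasoning: the function name, the resistance values, the 85 °C limit, and the 500 W load are all assumed for the example and are not figures from Microsoft.

```python
def max_coolant_temp_c(chip_limit_c, power_w, resistances_k_per_w):
    """Warmest coolant that still keeps the chip at or below its limit.

    Uses a one-dimensional series model:
        chip_temp = coolant_temp + power * sum(resistances)
    """
    total_resistance = sum(resistances_k_per_w)
    return chip_limit_c - power_w * total_resistance


# All values below are hypothetical, chosen only to show the trend.
CHIP_LIMIT_C = 85.0   # assumed maximum safe silicon temperature
POWER_W = 500.0       # assumed heat output of the chip

# Assumed heat path for a cold plate: silicon, interface layers, plate.
cold_plate_path = [0.03, 0.04, 0.03]
# Assumed heat path for microfluidics: silicon, then coolant in the channel.
microfluidic_path = [0.03, 0.02]

cold_plate_limit = max_coolant_temp_c(CHIP_LIMIT_C, POWER_W, cold_plate_path)
microfluidic_limit = max_coolant_temp_c(CHIP_LIMIT_C, POWER_W, microfluidic_path)

print(f"Cold plate:   coolant can be at most {cold_plate_limit:.1f} C")
print(f"Microfluidic: coolant can be at most {microfluidic_limit:.1f} C")
```

With these made-up numbers, halving the resistance of the heat path lets the coolant run roughly 25 °C warmer for the same chip temperature, and warmer coolant takes less energy to produce.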
Microfluidics could also help a data center manage sudden spikes in demand. Microsoft cited the example of Microsoft Teams calls, which tend to start at the same regular times. To handle these peaks, a data center either needs extra server capacity or must overclock its existing servers, which risks overheating them. Because microfluidic cooling removes heat more effectively, it allows more aggressive overclocking with less risk of damaging the chips.
Theoretically, if servers can operate more efficiently without risk of overheating, there could be a reduction in the number of servers needed on-site. Microfluidics could also allow denser arrangements of servers within a single data center, reducing both monetary and environmental costs associated with constructing additional facilities.
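The capacity argument comes down to simple arithmetic: if each server can safely do more work during a peak, fewer servers are needed to cover that peak. The sketch below shows the calculation under stated assumptions. The function name, the load units, and the 20 percent overclocking headroom are hypothetical and do not come from Microsoft.

```python
def servers_needed(peak_load, base_capacity, overclock_pct=0):
    """Smallest number of servers that covers a peak load.

    peak_load:      work to be handled at peak, in arbitrary units
    base_capacity:  work one server handles at its normal clock speed
    overclock_pct:  extra capacity per server from overclocking, in percent

    Integer arithmetic is used so the ceiling division is exact.
    """
    if peak_load < 0 or base_capacity <= 0 or overclock_pct < 0:
        raise ValueError("inputs must be non-negative, capacity positive")
    numerator = peak_load * 100
    denominator = base_capacity * (100 + overclock_pct)
    return -(-numerator // denominator)  # ceiling division


# Hypothetical peak: 12,000 units of work, 100 units per server.
without_overclock = servers_needed(12_000, 100)
with_overclock = servers_needed(12_000, 100, overclock_pct=20)

print(without_overclock, with_overclock)  # prints "120 100"
```

In this invented scenario, a 20 percent overclocking margin trims the fleet from 120 servers to 100. Real savings would depend on how long servers can sustain the higher clock speed and on how much of the workload is limited by the processor.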
These advances may prove crucial for next-generation chips, which are expected to be more powerful than current designs and could generate more heat than existing cooling methods can handle. Microfluidics might also make 3D chip architectures practical. Such designs promise more computing power than today’s two-dimensional chips, but overheating has so far held them back. With microfluidics, coolant could potentially flow through the chip itself.
Efficiency can also be a double-edged sword
Microsoft has not provided a timeline for deploying the technology more broadly, and it acknowledges that further testing is essential. Hardware and supply chains would also need to adapt to microfluidics. One open question, for example, is the best point in the production process at which to etch the channels into chips. Fortunately, Microsoft can use the same coolant it uses today, a blend of water and propylene glycol.
Other institutions have also explored microfluidics for years. For example, HP received a $3.25 million grant from the Department of Energy last year to work on its own microfluidic cooling technologies. “It’s encouraging to see all these efforts, and we are eager to contribute to accelerating progress,” said Husam Alissa, director of systems technology in Cloud Operations and Innovation at Microsoft.
In a recent blog post on its microfluidics progress, Microsoft said it hopes to help drive the development of more sustainable and efficient next-generation chips across the industry. Energy efficiency is a pressing concern for the company: its carbon emissions have risen as it has intensified its focus on generative AI. Efficiency gains can backfire, however. When a technology becomes cheaper and more accessible, people tend to use more of it, which can ultimately enlarge its environmental footprint. This effect is known as the Jevons paradox, and Microsoft CEO Satya Nadella has acknowledged it as a key factor influencing AI proliferation.
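The arithmetic behind the Jevons paradox is straightforward: total energy use is the energy per unit of work multiplied by the amount of work done, so an efficiency gain is wiped out once demand grows fast enough. The sketch below works through that relationship. The function names and the percentages are hypothetical, chosen only to illustrate the effect, and are not projections for Microsoft or for AI workloads.

```python
def relative_energy_use(efficiency_gain, demand_growth):
    """Total energy use relative to a baseline of 1.0.

    efficiency_gain: fractional cut in energy per unit of work (0.30 = 30% less)
    demand_growth:   fractional increase in work performed (0.60 = 60% more)
    """
    return (1.0 - efficiency_gain) * (1.0 + demand_growth)


def break_even_demand_growth(efficiency_gain):
    """Demand growth at which the efficiency gain is exactly cancelled out."""
    return 1.0 / (1.0 - efficiency_gain) - 1.0


# Hypothetical scenario: cooling and compute become 30% more efficient.
modest = relative_energy_use(0.30, 0.20)     # demand grows 20%
rebound = relative_energy_use(0.30, 0.60)    # demand grows 60%
threshold = break_even_demand_growth(0.30)

print(f"{modest:.2f} {rebound:.2f} {threshold:.3f}")  # prints "0.84 1.12 0.429"
```

In this made-up case, total energy use falls to 84 percent of the baseline if demand grows by 20 percent, but rises to 112 percent if demand grows by 60 percent. Any demand growth above roughly 43 percent turns the efficiency gain into a net increase.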