Reducing the Temperature: 2-Phase Liquid Immersion
National Guard soldiers train with the M1A2 Abrams battle tank during an exercise at McGregor Range Complex, N.M.,

In the real world, the goal of putting more power into less space has been thwarted by heat dissipation since the first steam engines. In 1999, Porsche was faced with getting more horsepower from its air-cooled, 3-liter engine. Acknowledging that heat was the issue, Porsche moved from an air-cooled design to a water-cooled one, in concert with a dry oil sump injection system. This gave the engine extremely high performance for its size, in a nimble car whose power-to-weight ratio still beats all comers in the market.

This same dilemma occurred in the computing world, where many believed that “more is better” when it came to cooling. The result was bigger heat sinks with more surface area, bigger and faster fans, larger air plenums to move the air, and numerous other cooling techniques. In the end, the law of diminishing returns took hold, leaving systems with insufficient cooling and, in many cases, fan noise exceeding 89 decibels.

As in the Porsche example, putting more power into a smaller displacement required an unconventional approach, and the engineers at OSS, like those at Porsche, realized that a new one was needed. They recognized that the power needs of advanced GPUs in constrained spaces at the edge would amplify these same issues. They also understood that high-performance systems require large power supplies, and that essentially every watt of electricity drawn becomes a watt of heat to remove (a 1:1 heat-to-electricity ratio), which added to the constraints they faced. So they combined their talents with the engineers at TMGcore and came up with a 2-phase liquid immersion cooling solution, or 2PLIC, for their Rigel system.

The most common causes of hardware overheating range from simply exceeding the heat load the cooling system can manage to a rise in ambient temperature that creates hot spots within densely populated hardware. Advances in system monitoring, such as the Unified Base Board Manager from OSS, have allowed these issues to be monitored closely, with a bit of AI employed to “fiddle with the knobs” and adjust for them. In certain data mining activities, an ASIC may be overclocked by as much as 40 to 60 percent; conversely, slowing the clock speed allows more densely packaged solutions to produce less heat. But the 2PLIC within the Rigel allows for an uncompromised solution under a full compute load.

The Solution

The process of 2PLIC is conceptually simple, with a bit of “secret sauce” (Figure 1). It starts with a sealed system that immerses the electrical components in a dielectric liquid that serves as the coolant. As the components heat the liquid, it boils, and the energy absorbed in the change from liquid to vapor is known as latent heat. Boiling therefore carries heat away from the components and into the vapor. The vapor rises and collects on a vapor-to-liquid heat exchanger, where it gives up its heat and condenses. This returns the vapor to liquid form and negates the need for fans, circulation pumps, or other conduction cooling methods.
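
The boiling step can be put in rough numbers: the heat carried away per second equals the mass of fluid vaporized per second multiplied by its latent heat of vaporization. The sketch below is a minimal illustration, not a model of the Rigel system. The latent heat figure of 88 kJ/kg is an assumption, chosen as the order of magnitude published for some fluorinated dielectric fluids, and the function name and heat loads are hypothetical.

```python
# Illustrative only: the latent heat value is an ASSUMPTION, not a
# specification of the fluid used in the Rigel system.
ASSUMED_LATENT_HEAT_J_PER_KG = 88_000.0


def boil_off_rate_kg_per_s(heat_load_w,
                           latent_heat_j_per_kg=ASSUMED_LATENT_HEAT_J_PER_KG):
    """Return the mass of fluid (kg) that must vaporize each second
    to carry away heat_load_w watts purely as latent heat."""
    if heat_load_w < 0:
        raise ValueError("heat load must be non-negative")
    if latent_heat_j_per_kg <= 0:
        raise ValueError("latent heat must be positive")
    # Power (J/s) divided by energy per kilogram (J/kg) gives kg/s.
    return heat_load_w / latent_heat_j_per_kg


if __name__ == "__main__":
    # Hypothetical heat loads, in watts.
    for load_w in (500.0, 1000.0, 5000.0):
        rate = boil_off_rate_kg_per_s(load_w)
        print(f"{load_w:7.0f} W -> {rate * 1000:.1f} g/s of fluid vaporized")
```

In a sealed tank the same mass condenses on the heat exchanger and drips back, so the fluid is recirculated by the phase change itself and no pump is involved.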

This solution may have you clutching your pearls as you immerse your valuable electronics in a liquid that resembles water, but the dielectric fluid does not conduct electricity and has a proven track record of not impeding the operation of electronics. The fluorinated fluids from 3M are non-flammable and very safe to handle. They provide 2 to 3 times the heat removal capacity of the fluids used in single-phase immersion cooling and more than 10 to 20 times that of conventional cooling methods. It is interesting to note that the fluid weighs 80% as much as the same volume of water.

Market resistance will in time be overcome by the benefits, but there are still those who question the technology's practicality. Some believe that a "water-like" liquid has no place around highly valued and sensitive computer equipment. But extensive testing by 3M and others has shown that the heat is transferred to a dielectric, non-conductive fluid that will not short out or damage the system in any way. Other concerns include the fluid's impact on the environment and whether it would impede servicing or swapping blades within a system. These are laid to rest as well: although the fluid is water-like, it leaves no residue.

Each blade can be hot-swapped in the same way as in an air-cooled system, and eventually the fluid can be discarded with no impact on the environment.

Exploring SWaP-C (Size, Weight, and Power - Cost)

Just as in the Porsche example, the OSS engineers could have simply used a bigger box pulling more power, but they concluded, like the car manufacturer, that the ratio between size, weight, and power would set them apart. The goal was to reduce the footprint, reduce the weight, and reduce the power needed, and to do all of this at a cost that would keep the solution competitive.

In exploring the practical benefits of a 2PLIC solution, OSS determined that their densely populated Rigel HPC would be able to reduce its footprint by removing the fans and associated plenums otherwise required.

By removing conventional cooling and achieving greater cooling efficiency, the highly reliable Rigel becomes an ideal solution for the demands at the edge. Through extensive lab testing, OSS was able to reduce the size of the Rigel by approximately 25 percent by eliminating convection cooling.

Cooling in this manner reduces the power required to operate the Rigel, a strong benefit for power-constrained applications such as transportable systems. It also removes many of the cooling components that are prone to failure, thus theoretically improving the system's MTBF figures. Because the environment is sealed and controlled, the 2PLIC solution keeps out the foreign contaminants that plague systems in the field, which increases the system's practical life. Although noise is not always noted as a concern for edge solutions deployed to forward locations, eliminating fan noise is a collateral benefit that contributes both to the operator's comfort and to the stealth needed to go undetected.

The lower cost of 2PLIC is clearly a winner in data centers when energy costs are considered, but the calculus for an edge device is not so clear. In a data center, energy efficiency is discussed in terms of PUE, or Power Usage Effectiveness. PUE is the ratio of the total energy a facility draws to the energy delivered to its computing equipment, so a value of 1.0 would mean that no energy is spent on cooling or other overhead. A data center might have a PUE of 1.5, whereas a self-contained solution using 2PLIC may be closer to 1.02. The edge calculus is less clear in part because environmental constraints at the edge have always led to compromises involving performance, size, weight, and power. But one thing is clear: by increasing cooling capability, edge solutions can be denser and employ greater processing power.
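
To make the PUE comparison concrete, the following minimal sketch computes PUE from its standard definition (total facility power divided by IT equipment power) and the overhead implied by the two PUE values quoted above. The 10 kW IT load and the function names are hypothetical, chosen only for illustration.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be less than IT power")
    return total_facility_kw / it_equipment_kw


def overhead_kw(it_equipment_kw, pue_value):
    """Power spent on cooling and other overhead for a given IT load and PUE."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_equipment_kw * (pue_value - 1.0)


if __name__ == "__main__":
    it_load_kw = 10.0  # hypothetical IT load, not a Rigel specification
    for label, value in (("data center", 1.5), ("2PLIC system", 1.02)):
        extra = overhead_kw(it_load_kw, value)
        print(f"{label}: PUE {value:.2f} -> {extra:.2f} kW of overhead")
```

For the assumed 10 kW load, a PUE of 1.5 implies 5 kW of overhead, while a PUE of 1.02 implies 0.2 kW.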

About TMGcore

TMGcore, Inc. is the leader in the number of U.S. patents issued in the field of two-phase immersion cooling, supporting its science-based, high-performance computing solutions. Its mission is to develop and enable the evolution of the world’s most advanced high-performance computing solutions and by doing so, build the foundation upon which all of humanity prospers.

About One Stop Systems

One Stop Systems, Inc. (OSS) designs and manufactures innovative AI Transportable edge computing modules and systems, including ruggedized servers, compute accelerators, expansion systems, flash storage arrays, and Ion Accelerator™ SAN, NAS, and data recording software for AI workflows. These products are used for AI data set capture, training, and large-scale inference in defense, oil and gas, mining, autonomous vehicles, and rugged entertainment applications.
