Driven by NVIDIA's latest GPUs, liquid-cooled servers' market share is expected to rise from under 1% to over 15%

Monday, June 17, 2024

Liquid-cooled products simplify AI infrastructure

Supermicro recently launched a ready-to-deploy liquid-cooled AI data center designed for cloud-native solutions. Its SuperCluster offerings are intended to accelerate enterprise adoption of generative AI and are optimized for the NVIDIA AI Enterprise software platform, which is used to develop and deploy generative AI.
Supermicro's SuperCluster solution is designed for LLM training, deep learning, and large-scale inference. It supports NVIDIA AI Enterprise, which includes NVIDIA NIM microservices and the NVIDIA NeMo platform, enabling end-to-end generative AI customization. It is optimized for NVIDIA Quantum-2 InfiniBand and the new NVIDIA Spectrum-X Ethernet platform, with a network speed of 400Gb/s per GPU, and can scale to large computing clusters with tens of thousands of GPUs.
With Supermicro's 4U liquid cooling technology, NVIDIA's recently launched Blackwell GPU can deliver its full 20 petaFLOPS of AI performance on a single GPU. Compared with earlier GPUs, it provides 4 times the AI training performance and 30 times the inference performance, with additional cost savings.
Supermicro President and CEO Charles Liang said the company's solutions are optimized for NVIDIA AI Enterprise software, meet the needs of customers across industries, and are backed by global manufacturing capacity with world-class efficiency. As a result, Supermicro can shorten delivery times for ready-to-use liquid-cooled or air-cooled computing clusters built on NVIDIA HGX H100 and H200, as well as the upcoming B100, B200, and GB200 solutions.

Liquid-cooled models are increasingly being adopted

Supermicro says its liquid-cooled data center design adds almost no cost up front and delivers additional value to customers by continuously reducing electricity consumption. The total cost of ownership savings from adopting liquid cooling can be substantial: lower electricity consumption during operation can save up to $60 million in electricity expenses over 5 years. Supermicro's rack-scale, end-to-end liquid cooling solution, from cold plates to coolant distribution units (CDUs) and cooling towers, can reduce a data center's ongoing power consumption by up to 40%.
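As a rough illustration of how the two headline figures relate, the sketch below works backward from them. It assumes, purely for illustration, that both "up to" numbers describe the same facility and that electricity cost scales linearly with power consumption at a constant price; the article states neither, and the names `implied_baseline` and `baseline_total` are hypothetical.

```python
# Back-of-the-envelope arithmetic on the article's "up to" figures.
# ASSUMPTION (not from the article): both figures apply to the same data
# center, and electricity cost is proportional to power consumption.

SAVINGS_USD = 60_000_000   # article: up to $60 million saved over 5 years
YEARS = 5                  # article: savings period
POWER_REDUCTION = 0.40     # article: up to 40% lower power consumption


def implied_baseline(savings: float, reduction: float) -> float:
    """Electricity spend without liquid cooling that would be needed for
    a given fractional reduction to produce the given savings."""
    if not 0 < reduction <= 1:
        raise ValueError("reduction must be a fraction in (0, 1]")
    return savings / reduction


baseline_total = implied_baseline(SAVINGS_USD, POWER_REDUCTION)
baseline_per_year = baseline_total / YEARS
savings_per_year = SAVINGS_USD / YEARS

print(f"Implied 5-year electricity bill: ${baseline_total:,.0f}")
print(f"Implied yearly electricity bill: ${baseline_per_year:,.0f}")
print(f"Yearly savings:                  ${savings_per_year:,.0f}")
```

Under these assumptions, the figures would correspond to a facility spending about $150 million on electricity over 5 years, or roughly $30 million a year, so they describe a very large deployment rather than a typical one.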
Liquid cooling is not a new technology; it has been around for more than 30 years. In an interview, however, Liang said that demand for liquid cooling solutions used to be small, mainly for small OEMs, with lead times of about 4 to 12 months. Supermicro is now redesigning subsystems and components together with customers to improve data center performance and shorten lead times.
Since the beginning of this year, customers have been asking to adopt liquid cooling directly when building new data centers, and they also hope to convert some older air-cooled data centers to liquid cooling. Driven by this demand, the company's liquid cooling business is growing very quickly, and production capacity cannot keep up with orders.
The company is expanding production in several parts of the world, including the Netherlands, the United States, and Malaysia. The new factory in Malaysia is expected to start production within 2 to 3 months. As capacity ramps up and supply increases, customers should benefit from lower initial investment and lower total cost of ownership.
With growing demand for server clusters for large language models, liquid cooling solutions are expected to become mainstream. The market share of liquid cooling is estimated to have been below 1% over the past thirty years, but adoption in data centers is gradually increasing, and that share is expected to rise to more than 15%.

Continuously expanding the liquid-cooled product line

NVIDIA founder and CEO Jensen Huang praised Supermicro's designs, saying that generative AI is driving a reset of the entire computing stack and that new data centers will be GPU-accelerated and optimized for AI. He said Supermicro has designed top-notch NVIDIA accelerated computing and networking solutions that enable global data centers worth trillions of dollars to be optimized for the AI era.
Supermicro's current liquid-cooled generative AI SuperCluster products include the Supermicro NVIDIA HGX H100/H200 SuperCluster, a scalable computing unit with 256 H100/H200 GPUs in 5 racks (including 1 dedicated networking rack). Upcoming liquid-cooled SuperCluster products include the Supermicro NVIDIA HGX B200 SuperCluster and the Supermicro NVIDIA GB200 NVL72 or NVL36 SuperCluster.
In addition to the liquid-cooled models, air-cooled products have also been launched, such as the air-cooled Supermicro NVIDIA HGX H100/H200 SuperCluster, a scalable computing unit with 256 HGX H100/H200 GPUs in 9 racks (including 1 dedicated networking rack), as well as the upcoming air-cooled Supermicro NVIDIA HGX B100/B200 SuperCluster.
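The density difference between the two 256-GPU scalable units can be made explicit with a little arithmetic. The sketch below uses only the rack (cabinet) counts given above, plus one labeled assumption that each HGX system carries 8 GPUs, which the article does not state. The class and field names are hypothetical.

```python
from dataclasses import dataclass

# ASSUMPTION (not stated in the article): each HGX H100/H200 system
# holds 8 GPUs.
GPUS_PER_HGX_SYSTEM = 8


@dataclass(frozen=True)
class ScalableUnit:
    """One SuperCluster scalable unit as described in the article."""
    name: str
    total_gpus: int
    total_racks: int
    network_racks: int

    @property
    def compute_racks(self) -> int:
        # Racks left for GPU systems after the dedicated networking rack.
        return self.total_racks - self.network_racks

    @property
    def gpus_per_compute_rack(self) -> float:
        return self.total_gpus / self.compute_racks

    @property
    def systems_per_compute_rack(self) -> float:
        return self.gpus_per_compute_rack / GPUS_PER_HGX_SYSTEM


# Figures from the article: 256 GPUs in 5 racks (liquid) or 9 racks (air),
# each including 1 dedicated networking rack.
liquid = ScalableUnit("liquid-cooled", total_gpus=256, total_racks=5, network_racks=1)
air = ScalableUnit("air-cooled", total_gpus=256, total_racks=9, network_racks=1)

for unit in (liquid, air):
    print(
        f"{unit.name}: {unit.compute_racks} compute racks, "
        f"{unit.gpus_per_compute_rack:.0f} GPUs per rack, "
        f"{unit.systems_per_compute_rack:.0f} systems per rack"
    )
```

On these numbers the liquid-cooled unit fits the same 256 GPUs into half as many compute racks as the air-cooled unit, which is the space advantage the product comparison implies.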
Supermicro is one of the leading AI server manufacturers. Thanks to the artificial intelligence boom sparked by ChatGPT and its close cooperation with NVIDIA, Supermicro's financial results have soared over the past two years, and the company has been sought after by investors. Supermicro's liquid-cooled server technology now directly addresses the high power consumption of AI processors. With continued technological innovation and expanded production capacity, the company's liquid-cooled server business is expected to become a strong driver of growth.
