
Server power supplies are heading toward 10kW, and third-generation semiconductors may be the only way forward

Thursday, March 28, 2024

The surge in AI computing power has strained not only the deployment of computing hardware but also the power supply of data centers. At the current rate of growth in computing power, if the data center power architecture is not optimized, especially at the power supply unit (PSU), then power delivery, rather than the shortage of advanced packaging and high-bandwidth memory, may be the first bottleneck we face.

According to statistics, the global server power supply market will reach 31.6 billion yuan in 2025, of which the Chinese market will account for 9.1 billion yuan. Silicon-based designs still dominate, but in new and rebuilt data centers the power consumption of a single rack has risen sharply. Taking 6U AI servers as an example, the average power of a single rack has reached 10.5kW, and its annual electricity consumption is roughly equal to the household electricity use of 100 people. New server power supply designs are urgently needed.
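The rack figure above can be checked with simple arithmetic. The sketch below assumes the rack draws its 10.5kW average continuously for a full year; the function name and the per-person figure it derives are illustrative and not from the source.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 h, ignoring leap years


def annual_energy_kwh(average_power_kw: float) -> float:
    """Energy used in one year at a constant average power."""
    return average_power_kw * HOURS_PER_YEAR


rack_kwh = annual_energy_kwh(10.5)  # 10.5 kW rack from the text
per_person_kwh = rack_kwh / 100     # household use per person implied by "100 people"

print(f"Rack: {rack_kwh:,.0f} kWh/year")
print(f"Implied per-person household use: {per_person_kwh:,.1f} kWh/year")
```

Under that assumption the rack uses about 92,000 kWh per year, which implies roughly 920 kWh of household electricity per person per year.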

Third-generation semiconductors in server power supplies

Where space is limited, using third-generation semiconductors such as gallium nitride (GaN) or silicon carbide (SiC) to raise power density, shrink the footprint, and support higher power is a reasonable design choice. In most people's eyes, however, a data center with rows of racks should leave much more design flexibility for server power supplies.

In practice, with the evolution of power architectures, energy-saving and emission-reduction requirements, the release of new server standards, and the continued rise in per-rack power consumption, a single discrete power module now generally exceeds 1kW. The whole industry is moving toward higher power density, which gives third-generation semiconductors an opening in server power supplies.

Thanks to its wide bandgap, GaN maintains low on-resistance and low switching losses in high-voltage, high-frequency applications, which further improves energy efficiency. GaN modules typically reach power efficiencies of up to 94%. In addition, through the efforts of leading GaN manufacturers, a number of GaN server power supplies already achieve the 80 PLUS Titanium level.

Thanks to its higher breakdown electric field and saturation velocity, GaN can also support higher power density. Some GaN power modules on the market exceed 90W/in³, and GaN server power supplies can deliver 3kW while reducing the physical size of the discrete power module.

Huawei's 3000W GaN server power supply, for example, is designed around Infineon's GaN switches. One driver is the release of open standards such as OCP 3.0 and ORV, which place requirements on rack designs such as high power density and effective, low-cost thermal management.

In fact, as the power requirements of AI servers keep rising, 3kW of system power will soon be a thing of the past. Taking NVIDIA's latest B200 AI GPU as an example, its power consumption reaches 1200W at full load, and the DGX B200, a hardware platform with 8 GPUs, draws as much as 14.3kW.
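To see why 3kW units fall short, the sketch below estimates how many supplies of a given rating a 14.3kW system would need. The N+1 redundancy policy, the function name, and the list of ratings are assumptions for illustration; they do not describe how the DGX B200 is actually powered.

```python
import math


def psu_count(load_kw: float, psu_rating_kw: float, redundant: int = 1) -> int:
    """Supplies needed to carry the load, plus spare units (N+1 by default)."""
    needed = math.ceil(load_kw / psu_rating_kw)
    return needed + redundant


DGX_B200_KW = 14.3   # system power from the text
GPU_KW = 8 * 1.2     # eight B200 GPUs at 1200 W each
gpu_share = GPU_KW / DGX_B200_KW

print(f"GPUs alone: {GPU_KW:.1f} kW ({gpu_share:.0%} of system power)")
for rating_kw in (3.0, 4.5, 10.0):  # hypothetical PSU ratings
    print(f"{rating_kw:>4.1f} kW PSUs: {psu_count(DGX_B200_KW, rating_kw)} units with N+1")
```

With these assumptions, a 14.3kW load needs five 3kW supplies before any redundancy is added, which is why higher-rated modules become attractive.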

Because the cost of silicon carbide has not yet fallen to the level of gallium nitride or silicon devices, its current use in server power is mainly in medium- and high-power modular UPS systems, which also reflects the characteristics of the material itself. The bandgaps of gallium nitride and silicon carbide are not far apart, but in breakdown voltage, silicon carbide's 1700V far exceeds gallium nitride's 650V.

Infineon recently launched SiC discrete devices with a breakdown voltage of up to 2000V, which give a UPS more overvoltage margin and give SiC UPS modules a higher withstand-voltage rating. Combined with higher switching speeds, this can improve power efficiency and reduce system cost for products such as UPS systems.

3kW is no longer the upper limit

Faced with power-hungry loads such as GPU clusters, even existing GaN power supply solutions are starting to struggle, and the PUE targets for data centers have not been relaxed. Building an AI computing center on the most advanced accelerator hardware therefore requires new solutions and even higher power density.

Last year, Navitas launched the CRPS185, a 3200W power supply based on the OCP CRPS specification. It achieves a power density of 100W/in³ and is 40% smaller than equivalent silicon solutions. What's more, the CRPS185 is more than 96% efficient in the 20% to 60% load range, exceeding even the 80 PLUS Titanium standard.
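A couple of efficiency points matter because every lost watt becomes heat inside the supply. The sketch below compares the heat dissipated at 94% and 96% efficiency for a 3200W supply running at 50% load, which lies inside the 20% to 60% range quoted above. The choice of load point and the function name are assumptions for illustration.

```python
def loss_watts(output_w: float, efficiency: float) -> float:
    """Heat dissipated in the supply: input power minus output power."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return output_w / efficiency - output_w


output_w = 0.5 * 3200  # 50% load on a 3200 W supply (assumed operating point)

for efficiency in (0.94, 0.96):
    print(f"{efficiency:.0%} efficient: {loss_watts(output_w, efficiency):.1f} W lost as heat")
```

At this load point, moving from 94% to 96% cuts the heat to be removed from about 102W to about 67W per supply, roughly a one-third reduction.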

However, even 3200W will struggle to meet the power requirements of future AI servers. According to forecasts, as accelerators such as the B200, B100, and MI300X ship, the power demand of AI data centers could as much as triple in the next year. In response to this steep rise in server power requirements, Navitas released its latest product roadmap this year, which also opens up new market opportunities for silicon carbide in server power supplies.

In the 2 to 4kW range, both GaN and SiC can meet the needs of server power supplies based on bridgeless PFC designs, and GaN also has a cost advantage. Above 4kW, however, GaN's higher conduction losses make thermal design challenging. In this power range the two deliver similar efficiency at half load, but silicon carbide can be more efficient at full load.

That is why Navitas plans to release a new 4.5kW power platform this year that uses both GaN and SiC to push power density above 135W/in³ while keeping power efficiency above 97%. In terms of topology, the design abandons the standard four-diode bridge rectifier in favor of a silicon carbide half-bridge plus a gallium nitride half-bridge.
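Power density translates directly into enclosure volume. The sketch below converts the three power and density pairs quoted in this article into volumes. Two caveats: the 90W/in³ and 135W/in³ figures are stated as "more than", so the computed volumes are upper bounds, and pairing 90W/in³ with the 3kW supply is an assumption, since the article does not say they describe the same module.

```python
CM3_PER_IN3 = 16.387064  # exact: one cubic inch in cubic centimeters


def volume_in3(power_w: float, density_w_per_in3: float) -> float:
    """Enclosure volume implied by rated power and power density."""
    return power_w / density_w_per_in3


# (rated power in W, power density in W/in^3), figures quoted in the article
designs = {
    "3 kW GaN supply": (3000, 90),
    "3.2 kW supply": (3200, 100),
    "4.5 kW GaN + SiC platform": (4500, 135),
}

for name, (power_w, density) in designs.items():
    v = volume_in3(power_w, density)
    print(f"{name}: {v:.1f} in^3 ({v * CM3_PER_IN3 / 1000:.2f} L)")
```

Under these assumptions all three come out near 32 to 33 cubic inches, which suggests the density gains are being spent on delivering more power from a similar-sized enclosure rather than on shrinking the supply.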

In addition, Navitas announced that it will launch an 8-10kW server power platform at the end of this year to address next year's AI system power requirements. Navitas said the platform will use newer GaN and silicon carbide technologies and further extend the architecture. Servers built on the new generation of AI hardware are clearly pushing third-generation semiconductor manufacturers to speed up product iteration in order to seize the market opportunity.

As for the added cost of silicon carbide devices, it may be minor next to the high cost of AI servers. Taking NVIDIA's GB200 as an example, analysts estimate that a GB200-based AI server system costs between 2 and 3 million US dollars.

With the rapid rollout of cloud-based AI applications, data centers already face huge power challenges. Server power supplies based on third-generation semiconductors not only solve the problem of delivering high power but also reduce system and electricity costs. Although silicon solutions remain mainstream, server manufacturers are expected to accelerate the iteration of third-generation semiconductor server power supplies as the world's third-generation semiconductor manufacturers expand production and lower design costs.
