TrendForce reports that the HBM (High Bandwidth Memory) market’s dominant product for 2023 is HBM2e, employed by the NVIDIA A100/A800, AMD MI200, and most CSPs’ (Cloud Service Providers) self-developed accelerator chips. As the demand for AI accelerator chips evolves, manufacturers plan to introduce new HBM3e products in 2024, with HBM3 and HBM3e expected to become mainstream in the market next year.
The distinctions between HBM generations lie primarily in their speed. The transition to the HBM3 generation brought a proliferation of confusing names. TrendForce clarifies that the so-called HBM3 in the current market should be subdivided into two categories based on speed. One category includes HBM3 running at speeds between 5.6 and 6.4 Gbps, while the other features the 8 Gbps HBM3e, which also goes by several names including HBM3P, HBM3A, HBM3+, and HBM3 Gen2.
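To put those per-pin speeds in perspective, the short sketch below converts them into peak bandwidth per stack and tags each with the category described above. It is illustrative only: the 1024-bit interface width per stack is an assumption not stated in this article, and the function names are hypothetical.

```python
# Assumption (not from the article): each HBM stack exposes a 1024-bit interface.
INTERFACE_WIDTH_BITS = 1024


def stack_bandwidth_gbytes_per_s(pin_speed_gbps: float) -> float:
    """Peak bandwidth of one stack in GB/s for a given per-pin speed in Gbps."""
    # Gbps per pin * number of pins = Gbit/s; divide by 8 to get GB/s.
    return pin_speed_gbps * INTERFACE_WIDTH_BITS / 8


def speed_category(pin_speed_gbps: float) -> str:
    """Map a per-pin speed to the two categories described in the text."""
    if 5.6 <= pin_speed_gbps <= 6.4:
        return "HBM3"
    if pin_speed_gbps == 8.0:
        return "HBM3e"
    # The article does not classify any other speeds.
    return "outside the two categories described"


if __name__ == "__main__":
    for speed in (5.6, 6.4, 8.0):
        print(speed, speed_category(speed), stack_bandwidth_gbytes_per_s(speed))
```

Under that assumption, the move from 6.4 Gbps to 8 Gbps raises peak bandwidth per stack by a quarter.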
The development status of HBM by the three major manufacturers, SK hynix, Samsung, and Micron, varies. SK hynix and Samsung began their efforts with HBM3, which is used in NVIDIA’s H100/H800 and AMD’s MI300 series products. These two manufacturers are also expected to sample HBM3e in Q1 2024. Meanwhile, Micron chose to skip HBM3 and directly develop HBM3e.
HBM3e will be stacked with 24Gb mono dies; in an 8-layer (8Hi) configuration, the capacity of a single HBM3e stack will jump to 24GB. HBM3e is anticipated to feature in NVIDIA’s GB100, set to launch in 2025. Accordingly, major manufacturers are expected to release HBM3e samples in Q1 2024 and aim to mass-produce them by 2H 2024.
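The 24GB figure follows directly from the die density and layer count given above. A minimal sketch of that arithmetic (the function name is hypothetical and chosen only for illustration):

```python
def stack_capacity_gbytes(die_density_gbit: int, layers: int) -> float:
    """Capacity of one HBM stack in GB, given per-die density in Gb and layer count."""
    # Total gigabits across all stacked dies, converted to gigabytes (8 bits per byte).
    return die_density_gbit * layers / 8


if __name__ == "__main__":
    # 24Gb mono dies in an 8-layer (8Hi) stack, as described in the text.
    print(stack_capacity_gbytes(die_density_gbit=24, layers=8))
```

With 24Gb dies, the capacity in GB works out numerically equal to three times the layer count.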
CSPs are developing their own AI chips in an effort to reduce dependency on NVIDIA and AMD
NVIDIA continues to command the highest market share in AI server accelerator chips. However, the high cost of NVIDIA’s H100/H800 GPUs, priced between US$20,000 and US$25,000 per unit, coupled with the recommended eight-card configuration for an AI server, has dramatically increased the total cost of ownership. Therefore, while CSPs will continue to source server GPUs from NVIDIA or AMD, they are concurrently planning to develop their own AI accelerator chips.
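As a rough illustration of the scale involved, the sketch below multiplies the quoted per-unit price range by the eight-card configuration. It covers GPU spend only; the rest of the total cost of ownership (CPUs, memory, networking, power, and so on) is not quantified in this article and is left out. The function name is hypothetical.

```python
def gpu_cost_range_usd(unit_low: int, unit_high: int, cards_per_server: int = 8) -> tuple[int, int]:
    """Return the (low, high) GPU-only cost in US$ for one AI server."""
    return unit_low * cards_per_server, unit_high * cards_per_server


if __name__ == "__main__":
    # Price range and card count as quoted in the text above.
    low, high = gpu_cost_range_usd(20_000, 25_000)
    print(f"GPU cost per eight-card server: US${low:,} to US${high:,}")
```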
Tech giants Google and Amazon Web Services (AWS) have already made significant strides in this area with the Google Tensor Processing Unit (TPU) and AWS’ Trainium and Inferentia chips. These two industry leaders are also hard at work on their next-generation AI accelerators, which are set to use HBM3 or HBM3e. In addition, other CSPs in North America and China are conducting related verifications, signaling a potential surge in competition in the AI accelerator chip market in the coming years.