Hardware

High-Bandwidth Memory (HBM)

High-Bandwidth Memory (HBM) is a stacked DRAM technology that bonds multiple memory dies vertically and connects them to a processor such as a GPU through a very wide in-package interface. Accelerators built on it reach memory bandwidths of several terabytes per second, far exceeding conventional GDDR memory.

HBM is a type of DRAM engineered for extreme bandwidth rather than low cost per bit. Multiple thin DRAM dies are stacked vertically using through-silicon vias (TSVs) and mounted on an interposer adjacent to the processor die in a 2.5D package. This arrangement allows an interface width of 1,024 bits per HBM stack, compared with 32 bits for a typical GDDR6 chip. Because the interface is so wide, it can run at moderate per-pin data rates and still reach bandwidths that long-trace off-package interfaces cannot match at acceptable power.
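The relationship between bus width and bandwidth can be sketched with a short calculation: peak bandwidth is the interface width multiplied by the per-pin data rate, divided by 8 to convert bits to bytes. The per-pin rates below (6.4 Gb/s for the HBM stack, 16 Gb/s for the GDDR6 chip) are assumptions chosen for illustration and are not taken from this entry; the function name is likewise hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, pin_rate_gbit_s: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gb/s) / 8 bits per byte."""
    return bus_width_bits * pin_rate_gbit_s / 8


# Assumed per-pin data rates, for illustration only.
hbm_stack = peak_bandwidth_gb_s(bus_width_bits=1024, pin_rate_gbit_s=6.4)
gddr6_chip = peak_bandwidth_gb_s(bus_width_bits=32, pin_rate_gbit_s=16.0)

print(f"HBM stack:  {hbm_stack:.1f} GB/s")
print(f"GDDR6 chip: {gddr6_chip:.1f} GB/s")
print(f"ratio:      {hbm_stack / gddr6_chip:.1f}x")
```

Under these assumptions the HBM stack delivers more bandwidth than the GDDR6 chip even though each of its pins runs at well under half the speed, which is the trade the wide interface is designed to make.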

JEDEC standardized HBM in 2013; subsequent generations include HBM2 (2016), HBM2E (2018), HBM3 (2022), and HBM3E (2024). NVIDIA's H100 uses HBM3 and delivers approximately 3.35 TB/s of memory bandwidth. AMD's MI300X, which also uses HBM3 and carries 192 GB in a multi-chip package, reaches around 5.3 TB/s. HBM4, for which JEDEC published a standard in 2025, doubles the interface width to 2,048 bits per stack and targets roughly 2 TB/s per stack. SK Hynix, Samsung, and Micron are the three primary HBM manufacturers, and supply shortfalls for HBM3E were a documented bottleneck in AI hardware availability through 2024–2025.

HBM is central to AI performance because many large-model workloads are limited by memory bandwidth rather than arithmetic: compute units can execute far more operations per second than DRAM can supply operands for, a gap known as the memory wall. Higher bandwidth reduces the fraction of time compute units sit idle waiting for weights or activations, improving throughput and energy efficiency per useful FLOP. Placing the HBM stacks next to the compute die also shortens the signal paths, which lowers the energy spent per bit transferred compared with off-package memory; access latency, by contrast, remains broadly similar to that of other DRAM.
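One common way to reason about the memory wall is the roofline model, in which attainable throughput is the smaller of the compute peak and the product of memory bandwidth and arithmetic intensity (FLOPs performed per byte moved). The sketch below is a minimal illustration: the 1 PFLOP/s compute peak is a hypothetical round number, not a specification of any real chip, while the 3.35 TB/s bandwidth is the H100 figure quoted in this entry.

```python
def attainable_flops(peak_flops: float, bandwidth_bytes_s: float,
                     intensity_flop_per_byte: float) -> float:
    """Roofline bound: the lower of the compute roof and bandwidth x intensity."""
    return min(peak_flops, bandwidth_bytes_s * intensity_flop_per_byte)


PEAK_FLOPS = 1.0e15   # hypothetical accelerator peak (1 PFLOP/s), an assumption
BANDWIDTH = 3.35e12   # bytes/s, the H100 bandwidth figure quoted in this entry

# Arithmetic intensity above which a kernel stops being bandwidth-bound.
ridge = PEAK_FLOPS / BANDWIDTH
print(f"ridge point: {ridge:.0f} FLOP/byte")

for intensity in (1, 10, 100, 1000):
    frac = attainable_flops(PEAK_FLOPS, BANDWIDTH, intensity) / PEAK_FLOPS
    print(f"intensity {intensity:>4} FLOP/byte -> at most {frac:.1%} of peak")
```

Kernels whose intensity falls below the ridge point are bandwidth-bound in this model, so raising memory bandwidth lifts their ceiling directly, while kernels above it gain nothing from faster memory.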

As of 2026, HBM3E is the dominant standard in frontier AI accelerators; consumer GPUs continue to use GDDR7, preserving a wide cost and performance gap between segments. HBM's strategic importance is reflected in export control policy: the U.S. government has restricted exports of the advanced HBM used in AI chips to certain countries, treating it as a critical enabling component for large-scale AI training.

Example

An H100 (SXM) GPU's five active HBM3 stacks deliver about 3.35 TB/s of bandwidth to the chip, which helps keep its compute cores fed when streaming the large weight matrices of a model such as a 70B-parameter transformer, typically sharded across several GPUs.
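As a rough back-of-the-envelope check on this example, the time needed to read every weight once from memory gives a lower bound on one full pass over the model. The sketch assumes 2 bytes per parameter (FP16 or BF16 storage) and weights split evenly across a hypothetical number of GPUs, each with the 3.35 TB/s quoted above; it ignores activations, caches, and inter-GPU communication.

```python
def weight_stream_time_ms(n_params: float, bytes_per_param: float,
                          bandwidth_bytes_s: float, n_gpus: int = 1) -> float:
    """Lower bound (ms) on reading every weight once, weights split evenly over n_gpus."""
    bytes_per_gpu = n_params * bytes_per_param / n_gpus
    return bytes_per_gpu / bandwidth_bytes_s * 1e3


N_PARAMS = 70e9        # 70B-parameter model from the example
BYTES_PER_PARAM = 2    # assumption: FP16/BF16 weights
BANDWIDTH = 3.35e12    # bytes/s per GPU, the figure quoted in this entry

for n_gpus in (2, 4, 8):   # hypothetical sharding choices
    t = weight_stream_time_ms(N_PARAMS, BYTES_PER_PARAM, BANDWIDTH, n_gpus)
    print(f"{n_gpus} GPUs: at least {t:.1f} ms per full pass over the weights")
```

When each generated token requires a full pass over the weights, as in small-batch decoding, this bound translates directly into a ceiling on tokens per second, which is why bandwidth rather than FLOPs often sets the limit.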

Related terms

VRAM, GPU (Graphics Processing Unit), AI Accelerator
