Nvidia is developing the H100 120GB PCIe computing card
Nvidia unveiled the H100, based on the new Hopper architecture, at GTC earlier this year as the foundation of its next generation of accelerated computing platforms. The GPU has 80 billion transistors, uses a monolithic single-chip design with CoWoS 2.5D wafer-level packaging, and is manufactured on a 4N process that TSMC tailored for Nvidia.
According to the s-ss report, Nvidia is developing an H100 120GB PCIe computing card, which carries 40GB more memory than the existing H100 80GB PCIe computing card. The new card keeps the PCIe form factor, but it is not yet clear whether it uses HBM2e or HBM3.
The GH100 chip in this H100 120GB PCIe computing card is reportedly configured higher than the one in the existing PCIe version, which has 114 SMs and 14592 FP32 CUDA cores. It matches the chip in the SXM version: 132 SMs, for a total of 16896 FP32 CUDA cores, 528 Tensor Cores, and 50MB of L2 cache. This puts the single-precision floating-point performance of the H100 120GB PCIe version on par with the SXM version, at about 60 TFLOPS. The power consumption of the H100 120GB PCIe version is not yet known. Currently, the H100 80GB PCIe version is rated at 350W, while the H100 80GB SXM5 version is rated at 700W.
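The core counts quoted above follow from simple per-SM arithmetic, which the short Python sketch below reproduces. The per-SM figures (128 FP32 CUDA cores and 4 Tensor Cores) are inferred by dividing the article's totals by the SM counts. The throughput formula assumes the usual convention of counting one fused multiply-add as two floating-point operations per core per clock. The function names are illustrative only, and the clock speed is back-calculated from the roughly 60 TFLOPS figure rather than taken from any published specification.

```python
# Per-SM resources inferred from the article's totals:
# 16896 / 132 = 128 FP32 cores per SM, 528 / 132 = 4 Tensor Cores per SM.
FP32_CORES_PER_SM = 128
TENSOR_CORES_PER_SM = 4
# Assumption: one fused multiply-add counts as 2 FLOPs per core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def cuda_cores(sm_count: int) -> int:
    """Total FP32 CUDA cores for a given number of enabled SMs."""
    return sm_count * FP32_CORES_PER_SM


def tensor_cores(sm_count: int) -> int:
    """Total Tensor Cores for a given number of enabled SMs."""
    return sm_count * TENSOR_CORES_PER_SM


def fp32_tflops(sm_count: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS at a given clock (GHz)."""
    gflops = cuda_cores(sm_count) * FLOPS_PER_CORE_PER_CLOCK * clock_ghz
    return gflops / 1000.0


def implied_clock_ghz(sm_count: int, tflops: float) -> float:
    """Clock (GHz) that a quoted peak TFLOPS figure would imply."""
    return tflops * 1000.0 / (cuda_cores(sm_count) * FLOPS_PER_CORE_PER_CLOCK)


if __name__ == "__main__":
    for name, sms in (("existing PCIe", 114), ("SXM / 120GB PCIe", 132)):
        print(f"{name}: {sms} SMs, {cuda_cores(sms)} FP32 cores, "
              f"{tensor_cores(sms)} Tensor Cores")
    # Clock implied by the article's ~60 TFLOPS figure for the 132-SM part.
    print(f"implied clock: {implied_clock_ghz(132, 60.0):.3f} GHz")
```

Under these assumptions, the same formula shows why a card with the full 132-SM configuration would need a clock similar to the SXM part to match its single-precision figure, which is one reason the unknown power limit matters.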
In addition, the GH100 die measures about 814mm² and supports Nvidia's fourth-generation NVLink interface, which provides up to 900 GB/s of bandwidth. The GH100 is also the first GPU to support the PCIe 5.0 standard and the first to use HBM3. It supports up to six HBM3 stacks with a bandwidth of 3TB/s, which is 1.5 times that of the A100 with HBM2E.
The photo shows that the system also contains a GeForce RTX ADLCE engineering sample. Although it is not labeled as such, it presumably is an Ada Lovelace architecture GPU. Its TDP is said to be limited to 350W, and its single-precision performance is only 63 to 70 TFLOPS.