Intel releases Xeon Max, the first x86 processor with HBM, and the Max series GPUs

Intel announced the Max series of CPUs and GPUs, built on chips codenamed Sapphire Rapids-HBM and Ponte Vecchio, respectively, and aimed at high-performance computing (HPC) and artificial intelligence (AI). Intel said the new products will power the Aurora supercomputer at the U.S. Department of Energy’s Argonne National Laboratory.

Intel describes Xeon Max as the first and only x86 CPU with high-bandwidth memory, able to accelerate many HPC workloads without code changes. It provides up to 56 performance cores based on the Golden Cove architecture, spread across four tiles that are connected using EMIB technology and packaged together. The TDP is 350W and the chip is manufactured on the Intel 7 process.

Each Xeon Max includes 64GB of HBM2e memory and also supports PCIe Gen5, CXL 1.1 (Compute Express Link), and eight-channel DDR5 memory. It also includes built-in AI acceleration through Intel Advanced Matrix Extensions (AMX).

Intel says the high-bandwidth memory on the Xeon Max is sufficient for the most common HPC workloads, and claims that in real-world HPC workloads the Xeon Max will perform up to 4.8 times better than competing products.

The Max series GPU is built on compute chips of the Xe-HPC architecture. Intel says it is the only HPC/AI GPU with native ray tracing acceleration, which is intended to speed up scientific visualization, and positions it as new infrastructure for the most demanding computing workloads. It has 64MB of L1 cache and 408MB of L2 cache (which Intel calls the highest in the industry) to improve throughput and performance.

According to Intel’s earlier disclosures, the Ponte Vecchio chip used in the Max series GPU is Intel’s first exascale computing GPU and uses Intel’s most advanced packaging technology to date, with more than 100 billion transistors. It has a total of 63 tiles: 16 Xe-HPC architecture compute tiles, 8 Rambo cache tiles, 2 Xe base tiles, 11 EMIB connection tiles, 2 Xe Link I/O tiles, 8 HBM tiles, and 16 thermal tiles for heat dissipation, all integrated using Foveros 3D packaging and EMIB.
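As a quick sanity check on the breakdown above, the listed counts can be tallied. This is only an illustrative sketch: the dictionary keys are labels chosen here, not Intel terminology, and "thermal" stands for the 16 tiles the text associates with TDP and heat.

```python
# Tile counts as listed in the paragraph above.
# Key names are illustrative labels, not official Intel names.
tiles = {
    "compute": 16,      # Xe compute tiles
    "rambo_cache": 8,   # Rambo cache tiles
    "base": 2,          # Xe base tiles
    "emib": 11,         # EMIB connection tiles
    "xe_link": 2,       # Xe Link I/O tiles
    "hbm": 8,           # HBM tiles
    "thermal": 16,      # tiles tied to TDP / heat in the text
}

total = sum(tiles.values())
active = total - tiles["thermal"]  # everything except the thermal tiles

print(f"total tiles: {total}, non-thermal tiles: {active}")
```

The sum matches the 63 quoted in the text, which confirms the list is complete as given.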

In addition to PCIe single cards and OAM modules, Intel also offers x4 GPU OAM carrier boards and Intel Data Center GPU Max series subsystems to enable high-performance multi-GPU communication within the subsystems.

The Aurora supercomputing system has more than 10,000 blades, and each blade (node) will contain six Max series GPUs and two Xeon Max CPUs. Intel also introduced a test development system, consisting of 128 blades, for researchers in Aurora’s early science program. Designed to handle high-performance computing, AI/ML, and big data analytics workloads, Aurora is expected to deliver more than 2 exaflops of peak computing power and to be operational in 2023.
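The per-node figures above imply rough lower bounds for the whole machine. The sketch below is back-of-the-envelope arithmetic only: it assumes exactly 10,000 nodes (the text says "more than"), and the constant names are made up for illustration.

```python
# Lower-bound estimate from the figures quoted in the article.
# MIN_NODES is an assumption: the text only says "more than 10,000".
MIN_NODES = 10_000
GPUS_PER_NODE = 6      # Max series GPUs per node
CPUS_PER_NODE = 2      # Xeon Max CPUs per node
HBM_PER_CPU_GB = 64    # HBM2e per Xeon Max, from the article

min_gpus = MIN_NODES * GPUS_PER_NODE
min_cpus = MIN_NODES * CPUS_PER_NODE
min_cpu_hbm_gb = min_cpus * HBM_PER_CPU_GB  # CPU-side HBM only

print(f"at least {min_gpus:,} GPUs and {min_cpus:,} CPUs")
print(f"at least {min_cpu_hbm_gb:,} GB of CPU-attached HBM2e")
```

Under these assumptions the system holds at least 60,000 GPUs and 20,000 CPUs; the real totals depend on the final node count.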

Intel’s next-generation Max series GPUs, code-named Rialto Bridge, are scheduled to launch in 2024 with higher performance and a seamless upgrade path. Further out, Intel will also launch an XPU code-named Falcon Shores. It combines two types of computing units, CPU and GPU, and will make extensive use of Intel’s multi-chip/multi-module approach, flexibly matching the number of x86 and Xe-HPC cores to the needs of the target application.