AMD Zen 4c: a single CCD doubles the core count and L2 cache
AMD is poised to unveil its EPYC processors codenamed Bergamo in the middle of this month. They are built on the Zen 4c core and offer up to 128 cores. Bergamo is optimized for compute-intensive applications and offers higher throughput than the existing EPYC Genoa. It competes with HBM-equipped Xeon processors and many-core Arm server products from Ampere, Amazon, and Google.
Zen 4c uses the same Instruction Set Architecture (ISA) as Zen 4 and is fundamentally a streamlined, more energy-efficient version of the Zen 4 core. The Zen 4c core is significantly smaller than the standard Zen 4 core, allowing a single Zen 4c CCD to house 16 cores, compared to only 8 cores in a Zen 4 CCD.
SemiAnalysis conducted a detailed analysis of the Zen 4c core, which, like Zen 4, uses TSMC’s 5nm process. Each Zen 4c CCD has two CCXs, each with 8 cores and 16MB of L3 cache. The L1 and L2 caches per core are identical to Zen 4, meaning each core has 32KB of L1 data cache, 32KB of L1 instruction cache, and 1MB of L2 cache.
AMD plans to launch the 128-core EPYC 9754 and the 112-core EPYC 9734 in mid-June. The EPYC 9754 has 32 more cores than the top-tier EPYC Genoa model, the EPYC 9654. The TDP remains at 360W, with the option to raise it to 400W. However, frequencies will be slightly lower, with the base clock reduced from 2.4GHz to 2.25GHz and the boost clock reduced from 3.7GHz to 3.1GHz. The total L3 cache will also decrease from 384MB to 256MB.
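The trade-off described above (more cores, lower clocks, less L3) can be put into rough numbers. The following is a minimal sketch that uses only the figures quoted in this article; the "cores x base clock" product is a crude illustrative proxy chosen here, not a benchmark or an AMD metric, and the helper names are hypothetical.

```python
# Figures as quoted in the article for the two top-end parts.
# The 96-core count for the EPYC 9654 follows from "32 more cores" (128 - 32).
genoa = {"name": "EPYC 9654", "cores": 96, "base_ghz": 2.4, "boost_ghz": 3.7, "l3_mb": 384}
bergamo = {"name": "EPYC 9754", "cores": 128, "base_ghz": 2.25, "boost_ghz": 3.1, "l3_mb": 256}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def core_ghz(part):
    """Crude aggregate proxy: core count times base clock (an assumption, not a benchmark)."""
    return part["cores"] * part["base_ghz"]


def l3_per_core_mb(part):
    """L3 capacity divided evenly across all cores."""
    return part["l3_mb"] / part["cores"]


base_delta = pct_change(genoa["base_ghz"], bergamo["base_ghz"])
boost_delta = pct_change(genoa["boost_ghz"], bergamo["boost_ghz"])
aggregate_ratio = core_ghz(bergamo) / core_ghz(genoa)

print(f"Base clock change:  {base_delta:.2f}%")
print(f"Boost clock change: {boost_delta:.2f}%")
print(f"Cores x base clock, Bergamo vs Genoa: {aggregate_ratio:.2f}x")
for part in (genoa, bergamo):
    print(f"{part['name']}: {l3_per_core_mb(part):.1f} MB of L3 per core")
```

Under this simple proxy the extra cores more than offset the lower base clock, while the L3 available per core is halved. Real workloads will not scale this cleanly.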
The EPYC Bergamo will have a maximum of 8 CCDs, compared to the 12 in the EPYC Genoa. While both possess identical IODs, the existing packaging cannot accommodate 12 Zen 4c CCDs. On the Genoa, the GMI3 wiring of the CCDs distant from the IOD passes under the nearer CCD’s L3 cache area, a feat not easily replicated on the Bergamo due to the split L3 cache on the Zen 4c CCD.
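The package totals follow directly from the CCD layout. Below is a small sketch that models each package as CCDs, CCXs per CCD, and cores per CCX, then derives total cores, L2, and L3. The `Package` class and its field names are hypothetical; the Zen 4 CCD is modelled as a single 8-core CCX with 32MB of L3, which matches the per-CCD capacity given in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical model of an EPYC package built from identical CCDs."""
    name: str
    ccds: int
    ccx_per_ccd: int
    cores_per_ccx: int
    l3_mb_per_ccx: int
    l2_mb_per_core: int = 1  # 1MB of L2 per core on both Zen 4 and Zen 4c

    @property
    def cores(self) -> int:
        return self.ccds * self.ccx_per_ccd * self.cores_per_ccx

    @property
    def l3_mb(self) -> int:
        return self.ccds * self.ccx_per_ccd * self.l3_mb_per_ccx

    @property
    def l2_mb(self) -> int:
        return self.cores * self.l2_mb_per_core


# Genoa: 12 Zen 4 CCDs, each one 8-core CCX with 32MB of L3.
genoa = Package("EPYC Genoa", ccds=12, ccx_per_ccd=1, cores_per_ccx=8, l3_mb_per_ccx=32)
# Bergamo: 8 Zen 4c CCDs, each two 8-core CCXs with 16MB of L3 apiece.
bergamo = Package("EPYC Bergamo", ccds=8, ccx_per_ccd=2, cores_per_ccx=8, l3_mb_per_ccx=16)

for pkg in (genoa, bergamo):
    print(f"{pkg.name}: {pkg.cores} cores, {pkg.l2_mb}MB L2, {pkg.l3_mb}MB L3")
```

The model reproduces the 96-core/384MB and 128-core/256MB totals quoted for the top Genoa and Bergamo parts. It also shows that total L2 grows with the core count even as total L3 shrinks.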
A side-by-side comparison of the Zen 4 CCD and Zen 4c CCD shows that the Zen 4c CCD has two 8-core CCXs arranged side by side, each with 16MB of L3 cache. It lacks the through-silicon via (TSV) array used for 3D V-Cache, which saves some space. At ISSCC 2023, AMD revealed that the Zen 4 CCD area is 66.3mm², while the Zen 4c CCD design area is only 72.7mm². That is an increase of less than 10%, despite doubling the core count and L2 cache while keeping the same L3 cache capacity, which points to a substantial reduction in the area of each Zen 4c core.
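The area figures can be checked with simple arithmetic. This sketch uses only the two CCD areas and core counts quoted above; the "CCD area per core" value divides the whole die (including L3 and interconnect) by the core count, so it is an illustrative density measure and not the area of an individual core.

```python
# CCD areas and core counts as quoted in the article.
ZEN4_CCD_MM2 = 66.3
ZEN4C_CCD_MM2 = 72.7
ZEN4_CORES_PER_CCD = 8
ZEN4C_CORES_PER_CCD = 16

# Relative growth of the whole CCD, in percent.
ccd_growth_pct = (ZEN4C_CCD_MM2 / ZEN4_CCD_MM2 - 1.0) * 100.0

# Whole-die area divided by core count (includes L3 and interconnect area).
zen4_mm2_per_core = ZEN4_CCD_MM2 / ZEN4_CORES_PER_CCD
zen4c_mm2_per_core = ZEN4C_CCD_MM2 / ZEN4C_CORES_PER_CCD

# Cores per square millimetre of CCD silicon, Zen 4c relative to Zen 4.
density_ratio = zen4_mm2_per_core / zen4c_mm2_per_core

print(f"CCD area growth: {ccd_growth_pct:.2f}%")
print(f"Zen 4 CCD area per core:  {zen4_mm2_per_core:.2f} mm^2")
print(f"Zen 4c CCD area per core: {zen4c_mm2_per_core:.2f} mm^2")
print(f"Core density ratio (Zen 4c / Zen 4): {density_ratio:.2f}x")
```

The CCD grows by a little under 10% while the number of cores per square millimetre rises by roughly 1.8x.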
Regarding chip interconnectivity, the Zen 4 and Zen 4c CCDs have similar IFOP designs, each including two GMI3 links. However, it appears that only one of them is used. Signals from the two CCXs therefore need to be multiplexed through a single link for communication with the IOD, similar to the Zen 2 architecture.
A comparison of individual Zen 4 and Zen 4c cores shows a significantly more compact layout for the latter. Compared with Zen 4, Zen 4c reduces the core area by 35.4%. Both have 1MB of L2 cache and the L2 SRAM arrays occupy the same area, so AMD shrank the L2 region by compacting the L2 control logic. Excluding the L2 and related circuits, the core area is reduced by as much as 44.1%, with the front end and execution area almost halved. The floating-point unit has not been reduced as much, possibly due to thermal considerations. The SRAM within the core is also laid out more compactly, with its area reduced by 32.6%.
According to Mark Papermaster, AMD’s Senior Vice President and Chief Technology Officer, Zen 4c and Zen 4 share the same design and deliver the same Instructions Per Clock (IPC). The only differences lie in the implementation and layout.