Comparing cache latency between the AMD RX 7600 and RX 7900 XTX

After years of evolution, graphics processing units (GPUs) have come to include a multi-tiered cache hierarchy. As with CPU caches, these carefully designed caches bridge the gap between the speed of the compute units and the much slower memory. In its RDNA 2 architecture, AMD implemented L0, L1, and L2 caches plus Infinity Cache, which acts as an L3 cache. With RDNA 3, AMD refined the cache structure further and moved Infinity Cache to its second generation.

The Radeon RX 7600 is built on Navi 33, a monolithic die manufactured on a 6nm process. In contrast, the Radeon RX 7900 series uses Navi 31, which adopts a Multi-Chip Module (MCM) design that combines dies of different sizes fabricated on 5nm and 6nm processes. A recent report published by Chips and Cheese compares the cache latency of the RX 7600 and the RX 7900 XTX.

According to the tests conducted by Chips and Cheese, the RX 7900 XTX spends 58% more time than the RX 7600 retrieving data from the Infinity Cache. The same pattern holds for GDDR6 memory, where the memory latency of the RX 7600 is 15% lower than that of the RX 7900 XTX. Such differences are conspicuous, though they need not carry over in full to performance: a larger cache means fewer memory accesses, and higher latency can be hidden by other techniques, such as data prefetching.
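
To make the trade-off between cache size and latency concrete, here is a minimal sketch of the standard average-access-time calculation. Only the 58% and 15% ratios come from the report cited above. The baseline latencies, the hit rates, and the function name `average_access_time` are hypothetical values chosen for illustration, not measurements of either card.

```python
def average_access_time(hit_rate, cache_latency_ns, memory_latency_ns):
    """Average latency of an access that reaches the last-level cache.

    A hit costs cache_latency_ns; a miss is assumed to cost
    memory_latency_ns in total (a simplification).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_latency_ns + (1.0 - hit_rate) * memory_latency_ns


# Hypothetical baseline latencies for the smaller GPU (NOT measured values).
SMALL_CACHE_NS = 100.0
SMALL_MEMORY_NS = 200.0

# Ratios taken from the article: the larger GPU needs 58% more time in
# Infinity Cache, and the smaller GPU's memory latency is 15% lower.
LARGE_CACHE_NS = SMALL_CACHE_NS * 1.58
LARGE_MEMORY_NS = SMALL_MEMORY_NS / 0.85

# Hypothetical hit rates: the larger cache is assumed to hit more often.
SMALL_HIT_RATE = 0.40
LARGE_HIT_RATE = 0.70

small = average_access_time(SMALL_HIT_RATE, SMALL_CACHE_NS, SMALL_MEMORY_NS)
large_same_hits = average_access_time(SMALL_HIT_RATE, LARGE_CACHE_NS, LARGE_MEMORY_NS)
large_more_hits = average_access_time(LARGE_HIT_RATE, LARGE_CACHE_NS, LARGE_MEMORY_NS)

print(f"smaller GPU:                     {small:.1f} ns")
print(f"larger GPU, same hit rate:       {large_same_hits:.1f} ns")
print(f"larger GPU, higher hit rate:     {large_more_hits:.1f} ns")
```

Under these assumed numbers, the larger GPU still has the higher average, but a higher hit rate narrows the gap considerably compared with the raw latency ratios. How far the gap closes in practice depends on real hit rates and on latency-hiding techniques such as prefetching, which this sketch does not model.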

Although Navi 33 is, like Navi 31, an RDNA 3 product, AMD’s cost-optimization strategy has subjected it to several constraints, including its fabrication process, register file capacity, and clock frequency. These design compromises have kept it from fully reaping the benefits of the architectural upgrade. However, its smaller, monolithic design gives it some latency advantages. It would be imprudent to assume that a monolithic Navi 31 would achieve comparable latency, since some of the improvement seen in Navi 33 is attributable to its smaller die size.