Nvidia is working on various multi-chip-module (MCM) GPU designs

Nvidia’s next generation will target the data center and consumer markets with GPUs based on the Hopper and Ada Lovelace architectures, respectively. The difference is that Nvidia will use an MCM multi-chip package only for the Hopper architecture; the Ada Lovelace architecture will retain a traditional monolithic design, so Nvidia will not bring MCM packaging to consumer GPUs the way AMD will with its RDNA 3-based Navi 31.

Nvidia researchers recently published an article detailing how the company is exploring multi-chip designs for future products. With the rise of heterogeneous computing, Nvidia is looking for ways to make its semiconductor designs flexible enough to match different modules to different workloads, which is where the MCM multi-chip package comes in.
Nvidia’s research on multi-chip designs first surfaced in 2017, when Nvidia demonstrated a design built from four small chips that not only improved performance and helped increase yield, but also allowed more computing resources to be pooled together. A multi-chip design also helps improve power-delivery efficiency and heat dissipation.

Nvidia’s current approach to MCM multi-chip package GPUs is called “Composable On Package GPU,” or COPA. The article explains how Nvidia handles the differences between HPC and AI workloads: as the computing needs of the two diverge, so do the hardware requirements. Nvidia worries that a single monolithic GPU architecture will gradually lose its computing advantage across both HPC and AI workloads, even as both markets keep growing.

To better address future computing needs, Nvidia has been simulating different multi-chip designs and configurations to determine which hardware modules different workloads require. According to data provided by Nvidia, on HPC workloads, reducing memory bandwidth by 25% cuts performance by only 4%, and reducing it by a further 25% adds roughly another 10% of performance penalty. In other words, after halving memory bandwidth and removing the associated hardware modules, the freed resources can be replaced with modules better suited to the target workload, improving overall efficiency. Since not every workload values each hardware module equally, COPA is Nvidia’s attempt to simulate the impact of multi-chip configurations and how they relate to performance.
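The intuition behind these numbers can be illustrated with a simple roofline-style model: a compute-bound workload tolerates bandwidth cuts with little loss until memory bandwidth becomes the bottleneck. The sketch below uses hypothetical round numbers and is not Nvidia’s actual simulation methodology.

```python
# Toy roofline model illustrating why a compute-bound workload tolerates
# memory-bandwidth cuts. All numbers are hypothetical placeholders; this
# is not Nvidia's COPA simulator.

PEAK_TFLOPS = 20.0   # assumed peak compute throughput, TFLOP/s
FULL_BW_TBS = 1.6    # assumed full memory bandwidth, TB/s
INTENSITY = 20.0     # assumed arithmetic intensity, FLOP per byte

def attainable_tflops(bandwidth_tbs: float) -> float:
    # Roofline: achieved throughput is capped either by peak compute
    # or by how fast memory can feed the compute units.
    return min(PEAK_TFLOPS, bandwidth_tbs * INTENSITY)

baseline = attainable_tflops(FULL_BW_TBS)
for cut in (0.00, 0.25, 0.50):
    perf = attainable_tflops(FULL_BW_TBS * (1 - cut))
    loss = 100 * (1 - perf / baseline)
    print(f"bandwidth -{cut:4.0%}: {perf:5.1f} TFLOP/s ({loss:4.1f}% loss)")
```

In this toy model the first 25% cut costs nothing and halving bandwidth costs 20%; real suites average many kernels with different arithmetic intensities, which is why Nvidia’s measured average penalties (4%, then about 10% more) are smaller and more gradual.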
Nvidia currently prioritizes the HPC and AI markets, not only because they are high-margin but also because many companies are gradually encroaching on Nvidia’s market share with customized solutions. In principle, this workload-specific configurability could also be applied to other Nvidia GPU product lines, including GeForce graphics cards for the consumer market. However, game rendering is fundamentally different from professional workloads: if a multi-chip design were adopted there, the interconnect speed between the small chips would need to improve further.