Cerebras releases WSE-3: 5nm process, 4 trillion transistors, 900,000 AI cores

Cerebras has announced the Wafer Scale Engine 3 (WSE-3), billed as the world’s largest single semiconductor device, nearly the size of an entire 12-inch silicon wafer. The chip is purpose-built for training the industry’s largest AI models and delivers twice the performance of its predecessor, the WSE-2 (until now the fastest AI chip), at the same power consumption and price. The WSE-3 powers the Cerebras CS-3 AI supercomputer, which can be clustered up to 2048 nodes to deliver as much as 256 exaFLOPS of AI compute.
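
The cluster-scale figure follows directly from the per-system number: a quick back-of-the-envelope check, assuming the 125 petaFLOPS-per-CS-3 figure from the spec list below.

```python
# Illustrative arithmetic check of the cluster figure (not a Cerebras tool):
# 2048 CS-3 systems at 125 petaFLOPS each.
PETA = 10**15
EXA = 10**18

per_system_flops = 125 * PETA   # peak AI performance per CS-3
cluster_size = 2048             # maximum cluster size

cluster_flops = per_system_flops * cluster_size
print(f"{cluster_flops / EXA:.0f} exaFLOPS")  # -> 256 exaFLOPS
```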

Key specifications of the WSE-3 include:

  • 4 trillion transistors
  • 900,000 AI cores
  • 125 petaFLOPS of peak AI performance
  • 44GB on-chip SRAM
  • 5nm TSMC process
  • External memory: 1.5TB, 12TB, or 1.2PB
  • Trains AI models of up to 24 trillion parameters
  • Cluster size of up to 2048 CS-3 systems

Cerebras says the WSE-3 is built for enterprise and hyperscale workloads, targeting next-generation frontier models ten times larger than GPT-4 and Gemini. A model with 24 trillion parameters can live in a single logical memory space, with no partitioning or refactoring required, which significantly simplifies the training workflow and improves developer productivity.
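
To see why a 24-trillion-parameter model needs external memory tiers of the size listed above, here is a rough footprint estimate; the per-parameter byte counts are common rules of thumb for mixed-precision training, not Cerebras-published figures.

```python
# Rough memory-footprint estimate for a 24-trillion-parameter model.
# Byte counts are common mixed-precision training rules of thumb
# (fp16 weights + fp32 master weights, momentum, and variance),
# not Cerebras-published numbers.
params = 24e12

weights_fp16 = params * 2                # fp16 weights alone
train_state = params * (2 + 4 + 4 + 4)   # weights + Adam optimizer state

TB, PB = 1e12, 1e15
print(f"fp16 weights alone:  {weights_fp16 / TB:.0f} TB")   # ~48 TB
print(f"full training state: {train_state / PB:.2f} PB")    # ~0.34 PB
# Either figure fits within the 1.2 PB external memory option,
# in one logical memory space.
```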

The latest Cerebras software framework adds native support for PyTorch 2.0 and for current AI models and techniques, including multimodal models, vision transformers, mixture-of-experts, and diffusion models. It remains the only platform with native hardware acceleration for dynamic and unstructured sparsity, which can speed up training by as much as 8x.
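
As a generic illustration of what unstructured sparsity means (using standard PyTorch utilities, not the Cerebras SDK), magnitude pruning zeroes individual weights with no block or N:M pattern; hardware that can skip those irregular zeros natively, as Cerebras claims for the WSE-3, is what turns the sparsity into a training speedup.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Generic PyTorch sketch of unstructured sparsity (not Cerebras SDK code):
# L1 magnitude pruning zeroes the smallest 80% of individual weights,
# with no block structure imposed.
layer = nn.Linear(1024, 1024)
prune.l1_unstructured(layer, name="weight", amount=0.8)

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.0%}")  # ~80% zeros, irregularly placed

# On most GPUs, such irregular zeros yield little benefit; hardware that
# skips them natively is what converts unstructured sparsity into speed.
```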