Cerebras has announced Andromeda, a 13.5-million-core AI supercomputer deployed in a California data center and now in use for commercial and academic work. It is built from a cluster of 16 Cerebras CS-2 systems and uses Cerebras' MemoryX and SwarmX technologies to simplify and coordinate model splitting across the systems, delivering more than 1 exaflop of AI compute and 120 petaflops of dense compute at 16-bit half precision.
According to Cerebras, Andromeda is built on AMD's third-generation EPYC server processors and Cerebras' Wafer Scale Engine 2, and is the only AI supercomputer to demonstrate near-perfect linear scaling on large language model workloads using simple data parallelism alone. On GPT-class large language models it scales almost linearly, which standard GPU clusters do not match.
The Wafer Scale Engine 2 is the largest single die in the world, measuring 462.25 square centimeters, almost the full area of a 12-inch wafer. It packs 850,000 AI cores and 2.6 trillion transistors, carries 40 GB of on-chip SRAM, provides 20 PB/s of cache bandwidth and 220 Pb/s of interconnect bandwidth, and is manufactured on TSMC's 7nm process. The chips are distributed across 124 server nodes in 16 racks, connected by a 100 GbE network and supported by 284 AMD third-generation EPYC server processors, each with 64 cores and 128 threads, for a total of 18,176 EPYC cores.
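The aggregate figures above follow directly from the per-unit specifications. A quick sanity check (the constant names below are illustrative, not Cerebras terminology):

```python
# Sanity check of Andromeda's aggregate core counts from the per-unit specs.
CS2_SYSTEMS = 16
CORES_PER_WSE2 = 850_000   # AI cores per Wafer Scale Engine 2
EPYC_CPUS = 284
CORES_PER_EPYC = 64        # third-gen EPYC: 64 cores / 128 threads each

ai_cores = CS2_SYSTEMS * CORES_PER_WSE2    # 13,600,000
epyc_cores = EPYC_CPUS * CORES_PER_EPYC    # 18,176

print(f"AI cores:   {ai_cores:,}")
print(f"EPYC cores: {epyc_cores:,}")
```

Note that 16 × 850,000 gives 13.6 million raw AI cores, slightly above the 13.5 million headline figure; the difference is presumably down to rounding or reserved cores, which the announcement does not break out.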
Andromeda's entire system draws 500 kW, far less than comparable GPU-accelerated supercomputers.