NVIDIA has just announced internal benchmark results for its Arm-based Grace CPU series. The chip is expected to power data centers and next-generation servers, and the initial results suggest that NVIDIA fans have reason to pin their hopes on the new product line. It could prove to be a truly groundbreaking product on the market.
Built on Arm Neoverse V2 cores, Grace CPUs will be used in NVIDIA Superchips of both the CPU+CPU and CPU+GPU types. The company recently announced the GH200, its most powerful GPU platform for AI and compute workloads, which comes with the world's fastest HBM3e memory and will be offered as a Grace Hopper Superchip.
Some key highlights of Grace CPUs include:
As part of the Hot Chips 2023 event, NVIDIA chief scientist Bill Dally presented performance comparisons between the Grace Superchip and competing dual-socket x86 solutions. These include AMD's EPYC 9654, its fastest 96-core, 192-thread CPU, and Intel's flagship Xeon Platinum 8480+ with 56 cores and 112 threads. Since the x86 solutions all run in a dual-socket configuration, the AMD platform totals 192 cores and the Intel platform 112 cores.
For the Grace Superchip, NVIDIA's solution offers a total of 144 cores (72 Arm Neoverse V2 cores per chip), supports up to 960GB of LPDDR5X memory with up to 1 TB/s of raw bandwidth, and has a total power draw of 500W. Additional specifications include 117MB of L3 cache and 58 PCIe Gen5 lanes, all built on a TSMC 4N process node.
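The core totals and the Grace Superchip's per-core resources follow from simple arithmetic on the figures quoted above. The short Python sketch below works them out. The names `PLATFORMS`, `total_cores`, and `per_core` are made up for illustration, and treating 1 TB/s as 1000 GB/s is an assumption about rounding rather than an NVIDIA specification.

```python
# Per-chip core counts as quoted in the article. "chips" means sockets
# for the x86 platforms and CPU dies per module for the Grace Superchip.
PLATFORMS = {
    "AMD EPYC 9654 (2S)": {"cores_per_chip": 96, "chips": 2},
    "Intel Xeon Platinum 8480+ (2S)": {"cores_per_chip": 56, "chips": 2},
    "NVIDIA Grace Superchip": {"cores_per_chip": 72, "chips": 2},
}

# Grace Superchip totals as quoted in the article.
GRACE_MEMORY_GB = 960
GRACE_BANDWIDTH_GBPS = 1000  # assumption: 1 TB/s taken as 1000 GB/s
GRACE_POWER_W = 500


def total_cores(platform: dict) -> int:
    """Total cores across all chips of one platform."""
    return platform["cores_per_chip"] * platform["chips"]


def per_core(total: float, cores: int) -> float:
    """Evenly divide a platform-wide resource across its cores."""
    return total / cores


if __name__ == "__main__":
    for name, spec in PLATFORMS.items():
        print(f"{name}: {total_cores(spec)} cores")

    cores = total_cores(PLATFORMS["NVIDIA Grace Superchip"])
    print(f"Memory per core:    {per_core(GRACE_MEMORY_GB, cores):.2f} GB")
    print(f"Bandwidth per core: {per_core(GRACE_BANDWIDTH_GBPS, cores):.2f} GB/s")
    print(f"Power per core:     {per_core(GRACE_POWER_W, cores):.2f} W")
```

Under these figures, each Grace core has roughly 6.7 GB of memory, 6.9 GB/s of bandwidth, and 3.5 W of the module's power budget. This is an even split for illustration only; real workloads do not share resources this uniformly.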
The results compiled by NVIDIA cover various server applications, including WRF (weather), CP2K (molecular dynamics), NEMO (climate), OpenFOAM (CFD), and GapBS BFS (graph analytics). Across them, NVIDIA's Grace Superchip delivers up to 40% better performance than AMD's Genoa CPU, while also outperforming Intel's Sapphire Rapids CPU. Notably, in terms of performance per watt, Grace is even more impressive, since it delivers these results while running at a lower TDP than its competitors.
The comparison becomes even more interesting when modeling an actual large-scale data center deployment. Throughput benchmarks for a 5 MW data center show that NVIDIA's Grace Superchip can deliver 2.5 times the performance of the x86 platforms, while also achieving superior efficiency at the same performance scale. For server customers and data centers invested in these types of workloads, Grace could be as big a game changer as the NVIDIA Tensor Core GPUs that have dominated the HPC and AI space.
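To see why a fixed power budget amplifies an efficiency advantage, consider a minimal Python sketch of a power-capped data center. The node power and per-node performance values below are hypothetical placeholders chosen only to show how such a ratio could arise. They are not NVIDIA's measured inputs, and the model ignores cooling, networking, and storage overhead. The function name `datacenter_throughput` is likewise made up for illustration.

```python
def datacenter_throughput(budget_w: int, node_power_w: int, node_perf: float) -> float:
    """Aggregate throughput of a power-capped data center.

    The budget limits how many nodes fit; total throughput is the node
    count multiplied by the relative performance of one node.
    """
    if node_power_w <= 0:
        raise ValueError("node_power_w must be positive")
    nodes = budget_w // node_power_w  # whole nodes that fit in the budget
    return nodes * node_perf


BUDGET_W = 5_000_000  # the 5 MW data center from the article

# HYPOTHETICAL values, for illustration only (not benchmark data):
# a baseline node drawing 1000 W with relative performance 1.0, and a
# more efficient node drawing 500 W with relative performance 1.25.
baseline = datacenter_throughput(BUDGET_W, node_power_w=1000, node_perf=1.0)
efficient = datacenter_throughput(BUDGET_W, node_power_w=500, node_perf=1.25)

if __name__ == "__main__":
    print(f"Baseline throughput:  {baseline:.0f}")
    print(f"Efficient throughput: {efficient:.0f}")
    print(f"Ratio: {efficient / baseline:.2f}x")
```

With these placeholder numbers, twice as many of the lower-power nodes fit in the same 5 MW, so a 25% per-node advantage becomes a 2.5x advantage at the data center level. The point is that per-node performance and per-node power multiply when power is the limiting resource.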