19-04-2021 | By Robin Mitchell
Recently, Nvidia announced that it is working on a custom ARM-based datacentre processor that will directly challenge Intel. What will the Grace processors provide, why do ARM processors make ideal candidates for next-generation datacentres, and does this fuel controversy around the Nvidia/ARM deal?
Recently, Nvidia announced that it is designing custom datacentre processors built on ARM cores. The processor is code-named Grace, after the computer programmer Grace Hopper, and the range is expected to be ready in 2023.
While little detail is available on the exact technical specifications of the new processors, it is known that they will use ARM Neoverse cores, which are high-performance cores aimed at the server market. The Neoverse range includes devices with core counts as high as 128, built using 7nm technology and designed specifically to work in mesh networks.
The new processors will also use specialised communication links between the processor itself and Nvidia GPUs. This will allow the new server processors to better handle AI tasks such as training and inference.
The choice of ARM makes sense given that Nvidia is in the process of purchasing ARM for $40 billion, but this alone would not be a deciding factor. Most processors used in mainstream servers and datacentres are sold by either Intel or AMD. All of these are Complex Instruction Set Computing (CISC) processors, meaning that they integrate a wide range of complex instructions to help speed up computation.
ARM-based processors, by contrast, are Reduced Instruction Set Computing (RISC) designs, which means that their instruction set is significantly smaller. As a result, a complex operation will often take longer on an ARM processor than on an Intel or AMD processor, because it has to be built up from several simpler instructions.
Because of their complexity, however, CISC processors consume more energy and are physically larger than ARM systems. Furthermore, one of the most important figures for a datacentre is the number of tasks that it can execute simultaneously (as this allows the system to handle more requests). RISC processors offer a higher processor density (per unit area) than CISC processors, and therefore a datacentre using RISC would have far more cores.
Of course, having more cores does not mean that the system overall will be more powerful; 1 million Z80 processors operating at 4MHz would still pale compared to high-end Intel cores. But considering that RISC processors can easily exceed the GHz barrier and that the computations in AI inference are surprisingly simple (each one is simple, but a very large number of them are required), RISC processors could compete with CISC processors in the datacentre market.
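To make the trade-off concrete, the rough sketch below compares two processors using a deliberately crude model in which peak throughput is simply cores × clock speed × instructions per cycle. Every figure in it is a hypothetical assumption chosen for illustration and does not describe Grace, Neoverse, or any Intel or AMD part; the `Processor` class and its fields are likewise invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    """Hypothetical processor description; all values are illustrative assumptions."""
    name: str
    cores: int
    clock_hz: int
    instructions_per_cycle: int  # crude per-core average, assumed

    def peak_instructions_per_second(self) -> int:
        # Crude upper bound: ignores memory, interconnect and workload effects.
        return self.cores * self.clock_hz * self.instructions_per_cycle


# Assumed figures only: many simpler cores versus fewer, more complex cores.
many_small = Processor("many simple cores", cores=128,
                       clock_hz=2_500_000_000, instructions_per_cycle=2)
few_large = Processor("fewer complex cores", cores=64,
                      clock_hz=3_500_000_000, instructions_per_cycle=4)

for cpu in (many_small, few_large):
    rate = cpu.peak_instructions_per_second()
    print(f"{cpu.name}: {rate:.2e} instructions/s")
```

With these assumed numbers the chip with half as many cores still comes out ahead, which is the point that core count alone does not decide the matter. Change the assumed instructions per cycle of the larger cores from 4 to 2 and the result flips, which is why per-core speed and the simplicity of the workload matter as much as density.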
The single most important factor for Nvidia in the choice of ARM is that Nvidia already provides server solutions for customers, yet the processors used in them (such as those from Intel) are not fully optimised for Nvidia hardware accelerators (such as GPUs and AI accelerators). A custom processor that can use dedicated hardware channels to those accelerators could help create highly efficient systems.
It is not uncommon for tech companies to design their own processors; in fact, it is becoming increasingly common as designs require ever more specialised capabilities. For example, Google has been designing its own processor for its datacentres, Amazon is doing the same, and Apple has the M1, which will power its next-generation computers.
But the case with Nvidia is significantly different for one reason: Nvidia is purchasing ARM. ARM has historically been the Switzerland of processor technology; it offers equal terms to everyone and doesn’t take sides. Now that Nvidia is set to own ARM, some are worried that Nvidia will use this position to access the latest technology before its competitors.
Nvidia having control of ARM also raises the question of how Nvidia will influence ARM technology. For example, Nvidia will be integrating dedicated communication channels to maximise efficiency between ARM cores and Nvidia GPUs. While this doesn’t affect customers currently, Nvidia could change the fundamental architecture in a way that makes Nvidia products the best choice to pair with ARM cores. As such, competitors of Nvidia (such as AMD) would effectively be given a performance penalty on ARM systems.