ARM Announces N2 Neoverse Computing Cores for Data Centres

07-05-2021 | By Sam Brown

Recently, ARM announced details surrounding its next-generation processors aimed at the data centre and edge computing market. What challenges do data centres and edge devices face, what details have been released around the N2 architecture, and could RISC be the dominant CPU technology?

What challenges do data centres and edge devices face?

While the power and technology used in computing have undergone major changes, large computers that take up entire rooms still exist today. Of course, modern computers are many orders of magnitude faster than the computers of the past. Still, the idea of using a dedicated facility to process large amounts of data has been around since the first computers.

Modern data centres power the technological world as we know it; they enable the internet to function by hosting websites and name services, providing backup services for valuable data, processing financial data, and holding vital records. The introduction of IoT devices has pushed data centres further through remote data processing, whereby tasks far too complex for a simple IoT device to execute are sent to a remote data centre, processed there, and the result sent back.

The increasing demands on data centres, together with latency and privacy concerns, have led to the development of edge computing. Instead of sending all data to a remote centre to be processed, most remote data services can be handled either by a machine on the local network (i.e. a local server) or on the device itself using specialised hardware. For example, neural net inferences are notoriously complex to run on standard microcontrollers, but an AI accelerator engine (located either in the microcontroller itself or attached as an additional peripheral) can help the device run such inferences locally.

Even with the advances in edge computing, data centres constantly find themselves needing to process ever more information, run complex tasks, and provide such services to an increasing number of clients. 

ARM Announces N2 Neoverse Architecture

Recently, ARM announced the first real details behind its N2 Neoverse architecture and how the new devices will target the data centre market. ARM is famous for producing processors that typically use less energy than those from Intel and AMD, and these processors are often found in mobile technologies.

The smaller energy requirements of ARM cores result from their Reduced Instruction Set Computing architecture (also known as RISC). This means that the processor uses fewer and simpler instructions than Complex Instruction Set Computing (CISC) CPUs, such as x86 and x64.

For example, an ARM core and an Intel core would most likely execute the addition of two registers with similar energy and speed. Still, an Intel core would contain many more instructions for common tasks such as string manipulation, memory management, and specialised loops. However, this extra hardware makes the Intel core physically larger, more power-hungry, and more complex to program.

ARM is now looking to expand its market, and has recognised that data centres may be better off using RISC-based processors instead of CISC-based processors such as the server CPUs offered by Intel. As a result, the N2 Neoverse comes packed with a range of features including self-hosted trace, SVE2, pointer authentication, flag manipulation, and Secure EL2.

Interestingly, the core, which implements the Armv9.0-A architecture, follows a Harvard-style arrangement at the cache level, meaning that instructions and data are held in separate L1 caches. Main memory is still presented to software as a single address space, but the split caches give the CPU the advantage of being able to fetch instructions and access data simultaneously.

The N2 Neoverse also integrates an out-of-order pipeline, has a direct connection method to other CPUs, 64KB of L1 cache, between 512KB and 1MB of L2 cache, and utilises TrustZone security. Furthermore, the number of cores on a single die ranges from 8 to 128. The CMN-700 (Coherent Mesh Network) connects neighbouring CPUs directly together to support cluster operations and shared resources.

In the announcement, ARM stated that the N2 Neoverse will offer up to 40% more performance than the previous generation of devices. In addition, the Neoverse V1 cores will provide up to a 50% performance increase when operating on artificial intelligence tasks. Furthermore, ARM announced that the new architecture is being deployed by major service providers including Oracle and Alibaba.


Will RISC replace CISC?

If ARM successfully produces processors for servers and data centres, then there is a genuine possibility that RISC could replace CISC. However, to understand whether RISC could replace CISC, we first need to identify the areas where RISC has already replaced CISC entirely.

Almost all microcontrollers, IoT devices, and mobile devices use RISC. The reason is that RISC processors are simple to manufacture, consume far less energy, and leave more silicon die space for needed peripherals. But while RISC offers lower energy consumption, a CISC processor will always wipe the floor with a RISC processor in a core v. core comparison.

Warning, this is opinion and unverified ...

However, a core v. core match is extremely unfair, and when comparing the two technologies, we should instead look at performance “per area” of silicon. For example, a multicore CISC processor occupying N mm² would need to be compared to a RISC processor occupying the same N mm².

While such a competition is yet to be held, the silicon area needed for a dual-core CISC processor could plausibly house more than 32 RISC cores. Furthermore, RISC cores (such as those produced by ARM) can operate at GHz speeds with many instructions completing in one clock cycle. As such, an experiment comparing the two may find that while a single CISC core is “faster” than a single RISC core, a RISC CPU could perform more simultaneous tasks than a CISC CPU of the same size.
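The “per area” argument can be put into rough numbers. The short Python sketch below is a back-of-the-envelope model only: the core counts come from the dual-core v. 32-core comparison above, while the per-core speeds (a CISC core assumed to be four times faster than a RISC core) and the `Chip` class are hypothetical values chosen purely for illustration, not measurements. It also assumes the workload consists of independent tasks that spread perfectly across cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """A toy model of a CPU that fits in a fixed silicon area."""
    name: str
    cores: int              # cores that fit in the same N mm^2 (assumption)
    per_core_speed: float   # relative tasks per second per core (hypothetical)

    def aggregate_throughput(self) -> float:
        # Independent tasks are assumed to scale perfectly across cores.
        return self.cores * self.per_core_speed

    def single_task_time(self) -> float:
        # One task runs on one core, so extra cores do not help here.
        return 1.0 / self.per_core_speed


# Hypothetical figures: 2 large cores v. 32 small cores in the same area.
cisc = Chip("CISC", cores=2, per_core_speed=4.0)
risc = Chip("RISC", cores=32, per_core_speed=1.0)

for chip in (cisc, risc):
    print(
        f"{chip.name}: throughput={chip.aggregate_throughput():.1f}, "
        f"single-task time={chip.single_task_time():.2f}"
    )
```

Under these assumed numbers the CISC chip finishes any one task four times sooner, but the RISC chip completes four times as many tasks per second in total (32 v. 8 relative units). Changing the assumed speed ratio or core counts shifts the balance, which is why a real per-area benchmark would be needed to settle the question.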

In modern data centres, it is arguably more important to be able to service multiple customers than to be able to perform one task very quickly. As such, servers and data centres may benefit from using simpler processors, and many more of them. Furthermore, future processors may integrate specialist hardware accelerators (such as AI accelerators) to handle any complex routines that would normally take too long to complete on a general-purpose processor.

There is a very real chance that RISC could replace CISC, and the introduction of hardware accelerators may see computers move away from complex CPUs in exchange for custom hardware that can be tailored to a customer so as to provide the best performance. Why render graphics on a high-end Intel CPU when a cheap Nvidia card will do the same task, only better?
