Nvidia DPU Technology to be Used by Palo Alto Networks – What is a DPU?

04-08-2021 | By Sam Brown

Recently, Palo Alto Networks announced that its next-generation firewall technology has been specifically designed to work with Nvidia DPU technologies. What challenges do traditional computing models face, what is a DPU, and how will DPU technology enable faster data processing in the future?

What challenges do traditional computing models face?

Since the development of the first microprocessors, regular improvements in semiconductor technology have enabled other technologies to improve as well. As the demand for computing power increased, the ability to shrink transistors enabled next-generation microprocessors to meet these demands and more. Furthermore, transistor shrinkage and the development of new transistor technologies enabled the creation of faster circuits, which in turn allow larger datasets to be processed at higher speeds.

However, researchers are rapidly approaching the physical limits of semiconductor materials, and technological advancement in traditional processor design is starting to falter as a result. For example, Intel's difficulties in developing 7nm devices have held back its processors, allowing other companies such as AMD to take the lead. Eventually, technology will no longer be able to rely on transistor shrinkage to improve, and unless researchers can identify alternative methods, technological progress will stagnate.

What are DPUs?

One of the biggest disadvantages of a CPU is also its greatest advantage: it is designed to run any task. While the ability to run any task may sound ideal, it also means that CPUs are not designed to run any specific task efficiently. For desktop PCs, this generality is ideal as it allows for the execution of word processors, browsers, and multimedia, but for highly specific applications such as servers and supercomputers, it results in reduced performance and energy efficiency.

As a result, engineers are turning to hardware acceleration to improve the performance of specific tasks. A good example of hardware acceleration in modern computing is cryptography; cryptographic algorithms can be processor-intensive and require the generation of truly random numbers. Instead of running such functions in software, a dedicated hardware circuit can encrypt and decrypt data streams on the fly without any need for CPU intervention. This frees up the CPU to perform other tasks, thereby speeding up system performance.
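The idea of offloading encryption so the CPU stays free can be sketched in software. The following is a toy illustration only (not a real driver API, and the XOR "cipher" is not secure): a dedicated worker thread stands in for a hardware crypto engine that processes data streams while the main thread continues with other work.

```python
import threading
import queue

class CryptoEngine:
    """Toy stand-in for a hardware crypto block running alongside the CPU."""

    def __init__(self, key: bytes):
        self.key = key
        self.jobs: queue.Queue = queue.Queue()
        # The worker thread plays the role of the dedicated hardware circuit.
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            data, result = self.jobs.get()
            # Stand-in cipher: XOR with a repeating key (NOT real cryptography).
            result.put(bytes(b ^ self.key[i % len(self.key)]
                             for i, b in enumerate(data)))

    def encrypt_async(self, data: bytes) -> queue.Queue:
        """Submit a buffer; the caller's thread is free until it collects the result."""
        result: queue.Queue = queue.Queue()
        self.jobs.put((data, result))
        return result

engine = CryptoEngine(key=b"\x42\x17")
pending = engine.encrypt_async(b"hello")
# ... the "CPU" could do unrelated work here ...
ciphertext = pending.get()
```

Because XOR is its own inverse, feeding the ciphertext back through the engine recovers the plaintext, which makes the round trip easy to verify.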

A new device concept that is increasing in popularity is the Data Processing Unit, or DPU for short. A DPU is a computational device that combines a CPU, network accelerators, and reprogrammable hardware (in the form of FPGAs) with the primary focus of accelerating data handling and routing. In a traditional server, the CPU is responsible for all tasks, including obtaining data from drives, routing network connections, and servicing clients. A DPU, however, offloads most of these tasks from the server's main CPU, thereby freeing it up to handle other tasks including operating systems, security, and monitoring.

How will DPU technology help systems of the future?

DPUs are typically built as large PCIe add-in cards, meaning that they can be installed in and removed from systems as needed, depending on the specific requirements of the system (similar to GPUs).

Traditional systems are application-based, with the underlying hardware remaining unchangeable during operation. The use of DPUs will instead make systems data-driven, whereby real-time data usage causes changes in the hardware so that the data may be better served. In a world that is increasingly becoming data-driven through the use of IoT and cloud technologies, DPUs will allow systems to scale in real-time while providing far greater data rates than those currently achievable with traditional systems.

The advantages of DPUs are already being seen. One such example is Palo Alto Networks, which recently announced that its next-generation firewall system is being designed around Nvidia DPUs. According to Palo Alto Networks, the use of Nvidia DPUs in its firewall technology will allow for data throughputs of up to 100Gbps while freeing up CPU resources to handle other tasks. For perspective, this figure is around five times that achievable using traditional server technology, whereby all tasks are handled by CPUs.

However, Palo Alto Networks noted that using DPUs to accelerate network and data traffic is nothing new. What makes the application of a DPU different in this case is that the DPU is used on a virtual software platform, which enables DPU resources to scale depending on traffic. Simply put, firewalls cannot monitor every large data flow from servers, so the new system identifies streams of data that are better served on a DPU and then assigns a DPU to handle that traffic flow.
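The steering logic described above can be sketched as a toy dispatcher. This is purely illustrative (the class names, threshold, and round-robin policy are assumptions, not Palo Alto Networks' actual design): heavy streams are assigned to a DPU pool, while light flows stay on the CPU path.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    flow_id: str
    mbps: float  # observed throughput of the stream

class FlowDispatcher:
    """Toy sketch: steer heavy traffic flows to a DPU pool, light ones to the CPU."""

    def __init__(self, dpu_count: int, offload_threshold_mbps: float = 1000.0):
        self.dpu_count = dpu_count
        self.offload_threshold = offload_threshold_mbps
        self.assignments: dict = {}
        self._next_dpu = 0

    def assign(self, flow: Flow) -> str:
        # Streams above the threshold are deemed better served on a DPU.
        if flow.mbps >= self.offload_threshold and self.dpu_count > 0:
            target = f"dpu{self._next_dpu % self.dpu_count}"
            self._next_dpu += 1  # round-robin across available DPUs
        else:
            target = "cpu"
        self.assignments[flow.flow_id] = target
        return target

dispatcher = FlowDispatcher(dpu_count=2)
```

For example, a 50Mbps flow would stay on the CPU path, while multi-gigabit streams would be spread across the two DPUs in turn.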

Hardware acceleration will undoubtedly become the new norm for computing as processor technology stagnates. Instead of having generic processors handle all tasks, using dedicated hardware circuits to process specific tasks will help accelerate designs and improve efficiency, with gains almost akin to those once delivered by reducing transistor size.

By Sam Brown