Seminar Details
2025-10-09 (13:00) : Sloth: A Kernel-Bypass Scheduler Maximizing Energy Efficiency under Latency Constraints
At Shannon room, Maxwell building
Organized by Computer Science and Engineering
Speaker :
Clément Delzotti (ICTEAM/UCLouvain)
Abstract :
Continuously increasing network speeds make packet processing on CPUs increasingly challenging. At a line rate of 100 Gbps, today’s CPUs struggle to execute complex network functions. This trend calls for offloading packet processing to other devices. This work explores how Graphics Processing Units (GPUs) and programmable Network Interface Cards (NICs) can be used instead of a general-purpose CPU for packet processing. GPUs have been proposed to accelerate network processing thanks to their massively parallel architectures. Recent NICs provide tighter integration with GPUs, with the ability to write received packets directly to GPU memory. Recent SmartNICs, or DPUs, can also receive packets into their own memory and process them with embedded ARM or RISC processors. Such improvements allow bypassing the CPU entirely by processing packets only on the GPU cores or on the SmartNIC’s cores.
In this work, we review various models for packet (co-)processing applied to Network Function Virtualization, including CPU+GPU hybrid, SmartNIC-only, and GPU-only approaches. We introduce a novel communication model between CPU cores and the GPU, enabling scalable CPU-GPU hybrid utilization while minimizing the CPU resources needed. We show that, for a computation-heavy workload, current CPU-only implementations handle up to 45% of the 100 Gbps line rate, whereas GPU implementations can saturate it. We also show that recent SmartNICs have cores powerful enough to replace the main CPU for some traditional packet processing, alleviating the load on the host, which can then be dedicated entirely to running applications. Finally, we offer a novel energy-efficiency perspective, showing that GPUs and DPUs outperform traditional CPUs by 2 to 3× in terms of joules per packet.
