scx/scheds/rust/scx_bpfland/README.md

scheds: introduce scx_bpfland

Overview
========

This scheduler is derived from scx_rustland, but it is fully implemented in BPF, with a minimal user-space Rust part that processes command-line options, collects metrics, and logs scheduling statistics.

Unlike scx_rustland, all scheduling decisions are made by the BPF component.

Motivation
==========

The primary goal of this scheduler is to act as a performance baseline for comparison with scx_rustland, allowing for a better assessment of the overhead caused by kernel/user-space interactions.

It can also be used to deploy prototypes initially tested in the scx_rustland scheduler. This scheduler is expected to outperform scx_rustland, due to the elimination of the kernel/user-space overhead.

Scheduling policy
=================

scx_bpfland is a vruntime-based sched_ext scheduler that prioritizes interactive workloads. Its scheduling policy closely mirrors scx_rustland, but it has been re-implemented in BPF with some small adjustments.

Tasks are categorized as either interactive or regular based on their average rate of voluntary context switches per second: tasks that exceed a specific voluntary context switch threshold are classified as interactive.

Interactive tasks are prioritized in a higher-priority DSQ, while regular tasks are placed in a lower-priority DSQ. Within each queue, tasks are sorted based on their weighted runtime, using the built-in scx vtime ordering capabilities (scx_bpf_dispatch_vtime()).

Moreover, each task gets a time slice budget. When a task is dispatched, it receives a time slice equivalent to the remaining unused portion of its previously allocated time slice (with a minimum threshold applied). This gives latency-sensitive workloads more chances to exceed their time slice when needed to perform short bursts of CPU activity without being interrupted (e.g., real-time audio encoding / decoding workloads).

Results
=======

Initial test results indicate that this scheduler offers around a +5% improvement in frames-per-second (fps) compared to scx_rustland when using the benchmark "playing a video game while recompiling the kernel". This improvement was observed in games such as Cyberpunk 2077, Counter-Strike 2, and Baldur's Gate 3.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-24 06:56:03 +01:00
# scx_bpfland
This is a single user-defined scheduler used within [sched_ext](https://github.com/sched-ext/scx/tree/main), a Linux kernel feature that enables implementing kernel thread schedulers in BPF and loading them dynamically. [Read more about sched_ext](https://github.com/sched-ext/scx/tree/main).
## Overview
scx_bpfland is a vruntime-based sched_ext scheduler that prioritizes
interactive workloads.
This scheduler is derived from scx_rustland, but it is fully implemented in
BPF, with a minimal user-space Rust part that processes command-line options,
collects metrics, and logs scheduling statistics. The BPF part makes all the
scheduling decisions.
Tasks are categorized as either interactive or regular based on their average
rate of voluntary context switches per second. Tasks that exceed a specific
voluntary context switch threshold are classified as interactive. Interactive
tasks are prioritized in a higher-priority queue, while regular tasks are
placed in a lower-priority queue. Within each queue, tasks are sorted based on
their weighted runtime: tasks that have a higher weight (priority) or use the
CPU for less time (smaller runtime) are scheduled sooner, because they sit
closer to the front of the queue.
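
The classification and ordering described above can be sketched in plain Rust. This is a user-space illustration only, not the scheduler's BPF code: the names `classify`, `weighted_runtime`, and `Queues` are hypothetical, the threshold is passed in by the caller, the base weight of 100 is an assumption, and strictly serving the interactive queue first is a simplification of "higher priority".

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Task class derived from the voluntary context switch rate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Class {
    Interactive,
    Regular,
}

/// Tasks whose average voluntary context switch rate exceeds the
/// threshold are treated as interactive (threshold value is caller-chosen).
fn classify(avg_nvcsw_per_sec: u64, threshold: u64) -> Class {
    if avg_nvcsw_per_sec > threshold {
        Class::Interactive
    } else {
        Class::Regular
    }
}

/// Runtime scaled inversely by weight: a higher weight or a smaller
/// runtime yields a smaller key. The base weight of 100 is an assumption.
fn weighted_runtime(runtime_ns: u64, weight: u64) -> u64 {
    runtime_ns.saturating_mul(100) / weight.max(1)
}

/// Two queues, each a min-heap keyed by (weighted runtime, pid).
#[derive(Default)]
struct Queues {
    interactive: BinaryHeap<Reverse<(u64, u32)>>,
    regular: BinaryHeap<Reverse<(u64, u32)>>,
}

impl Queues {
    fn enqueue(&mut self, pid: u32, class: Class, vtime: u64) {
        let queue = match class {
            Class::Interactive => &mut self.interactive,
            Class::Regular => &mut self.regular,
        };
        queue.push(Reverse((vtime, pid)));
    }

    /// Simplification: always drain the interactive queue before the
    /// regular one; within a queue, the smallest weighted runtime wins.
    fn pick_next(&mut self) -> Option<u32> {
        self.interactive
            .pop()
            .or_else(|| self.regular.pop())
            .map(|Reverse((_, pid))| pid)
    }
}
```

With a threshold of 10, a task that switches 50 times per second lands in the interactive queue and is picked before any regular task, and between two interactive tasks with equal runtime the one with the higher weight is picked first.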
Moreover, each task gets a time slice budget. When a task is dispatched, it
receives a time slice equivalent to the remaining unused portion of its
previously allocated time slice (with a minimum threshold applied). This gives
latency-sensitive workloads more chances to exceed their time slice when needed
to perform short bursts of CPU activity without being interrupted (e.g.,
real-time audio encoding / decoding workloads).
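
The time slice rule above amounts to a subtraction with a floor. A minimal sketch, assuming the hypothetical name `next_slice` and nanosecond units; how and when the scheduler refills a task's budget is not covered here.

```rust
/// Slice granted at dispatch: the unused part of the previously
/// allocated slice, but never less than the minimum threshold.
/// All values are in nanoseconds (an assumption of this sketch).
fn next_slice(prev_slice_ns: u64, used_ns: u64, min_slice_ns: u64) -> u64 {
    // saturating_sub covers a task that ran past its previous slice.
    prev_slice_ns.saturating_sub(used_ns).max(min_slice_ns)
}
```

For example, a task that used 1 ms of a 5 ms slice would be dispatched with 4 ms, while a task that used nearly all of its slice would fall back to the minimum.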
## Typical Use Case
Interactive workloads, such as gaming, live streaming, multimedia, real-time
audio encoding/decoding, especially when these workloads are running alongside
CPU-intensive background tasks.
In this scenario scx_bpfland ensures that interactive workloads maintain a high
level of responsiveness.
## Production Ready?
The scheduler is based on scx_rustland, implementing nearly the same scheduling
algorithm with minor changes and optimizations to be fully implemented in BPF.
Given that the scx_rustland scheduling algorithm has been extensively tested,
this scheduler can be considered ready for production use.