The scx_rusty scheduler does not support hotplug, and expects a static host topology throughout its runtime. Though the kernel does have support for detecting hotplug events, scx_rusty currently neither detects them on the kernel side nor surfaces them to user space when they happen. Now that we have scx_bpf_exit(), we can gracefully exit the kernel scheduler in the event of a hotplug, and communicate to user space that it should restart the scheduler. This patch adds that support to scx_rusty.
Note that this assumes that we're running on a kernel recent enough to have scx_bpf_exit(). If the kernel doesn't have it, we instead just error out of the kernel scheduler and exit the application.
Signed-off-by: David Vernet <void@manifault.com>
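The restart behavior described in the commit message can be pictured with a minimal sketch. The `SchedExit` enum and the `run_with_restarts` helper are hypothetical names invented for illustration; they are not scx_rusty's actual types. The sketch assumes each run of the scheduler reports why it stopped.

```rust
/// Hypothetical reasons a scheduler run can end (not scx_rusty's real types).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SchedExit {
    /// The kernel side exited gracefully (e.g. after a hotplug event) and
    /// asked user space to reload the scheduler.
    Restart,
    /// Normal shutdown requested by the user.
    Done,
    /// Unrecoverable error, e.g. the kernel lacks a graceful exit mechanism.
    Error(String),
}

/// Re-run the scheduler for as long as it asks to be restarted.
/// Returns the number of restarts performed, or the error message.
pub fn run_with_restarts<F>(mut run_once: F) -> Result<u32, String>
where
    F: FnMut() -> SchedExit,
{
    let mut restarts = 0;
    loop {
        match run_once() {
            SchedExit::Restart => restarts += 1,
            SchedExit::Done => return Ok(restarts),
            SchedExit::Error(msg) => return Err(msg),
        }
    }
}
```

In this sketch a restart request simply triggers another call to `run_once`, which would rebuild its view of the host topology before loading the scheduler again, while an error ends the application.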
scx_rusty
This is a single user-defined scheduler used within sched_ext, a Linux kernel feature that enables implementing kernel thread schedulers in BPF and loading them dynamically. Read more about sched_ext.
Overview
A multi-domain, BPF / user space hybrid scheduler. The BPF portion of the scheduler does a simple round robin in each domain, and the user space portion (written in Rust) calculates the load factor of each domain, and informs BPF of how tasks should be load balanced accordingly.
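As a rough illustration of the user space side, the sketch below sums a load value per domain and proposes moving load from the busiest domain to the least loaded one. It is not Rusty's actual algorithm: the `Task` and `Transfer` types, the definition of load as weight times utilization, and the `threshold` parameter are all assumptions made for this example.

```rust
/// Hypothetical task summary: owning domain, scheduling weight, and CPU
/// utilization in the range 0.0..=1.0.
#[derive(Debug, Clone, Copy)]
pub struct Task {
    pub dom: usize,
    pub weight: f64,
    pub util: f64,
}

/// Hypothetical balancing decision: move `load` from domain `from` to `to`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Transfer {
    pub from: usize,
    pub to: usize,
    pub load: f64,
}

/// Sum weight * utilization for the tasks of each domain.
/// Tasks that name a domain outside 0..nr_doms are ignored.
pub fn domain_loads(tasks: &[Task], nr_doms: usize) -> Vec<f64> {
    let mut loads = vec![0.0; nr_doms];
    for t in tasks {
        if t.dom < nr_doms {
            loads[t.dom] += t.weight * t.util;
        }
    }
    loads
}

/// Propose one transfer from the busiest to the least loaded domain when the
/// busiest domain exceeds the average load by more than `threshold`
/// (a fraction of the average). Returns None when no move is warranted.
pub fn plan_transfer(loads: &[f64], threshold: f64) -> Option<Transfer> {
    if loads.is_empty() {
        return None;
    }
    let avg = loads.iter().sum::<f64>() / loads.len() as f64;
    if avg <= 0.0 {
        return None;
    }
    let (from, &max) = loads.iter().enumerate().max_by(|a, b| a.1.total_cmp(b.1))?;
    let (to, &min) = loads.iter().enumerate().min_by(|a, b| a.1.total_cmp(b.1))?;
    if from == to || (max - avg) / avg <= threshold {
        return None;
    }
    // Move only as much as brings one of the two domains to the average.
    Some(Transfer { from, to, load: (max - avg).min(avg - min) })
}
```

With two domains carrying loads of 150 and 50 and a threshold of 0.1, the sketch proposes moving 50 units of load from domain 0 to domain 1. In the real scheduler, a decision of this kind would then be communicated to the BPF portion.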
How To Install
Available as a Rust crate: cargo add scx_rusty
Typical Use Case
Rusty is designed to be flexible, and accommodate different architectures and workloads. Various load balancing thresholds (e.g. greediness, frequency, etc.), as well as how Rusty should partition the system into scheduling domains, can be tuned to achieve the optimal configuration for any given system or workload.
Production Ready?
Yes. If tuned correctly, Rusty should be performant across various CPU architectures and workloads. By default, Rusty creates a separate scheduling domain per LLC, so its default configuration may be performant as well. Note however that scx_rusty does not yet disambiguate between LLCs in different NUMA nodes, so it may perform better on multi-CCX machines where all the LLCs share the same socket, as opposed to multi-socket machines.
Note as well that you may run into an issue with infeasible weights, where a task with a very high weight may cause the scheduler to incorrectly leave cores idle because it thinks they're necessary to accommodate the compute for a single task. This can also happen in CFS, and should soon be addressed for scx_rusty.
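To make the infeasible weight problem concrete, the sketch below compares each task's nominal CPU share (proportional to its weight) with a feasible share in which no task is credited with more than one CPU. The clamp-and-redistribute approach shown is a generic technique used here for illustration; it is an assumption, not a description of how scx_rusty handles or will handle the issue. The function names are hypothetical, and all weights are assumed to be positive.

```rust
/// Nominal share of each task, in CPUs, if capacity were split purely by weight.
pub fn nominal_shares(weights: &[f64], nr_cpus: f64) -> Vec<f64> {
    let wsum: f64 = weights.iter().sum();
    weights.iter().map(|w| w / wsum * nr_cpus).collect()
}

/// Feasible shares: a single task can use at most one CPU, so any task whose
/// share exceeds 1.0 is capped there and the remaining CPUs are split by
/// weight among the uncapped tasks. Repeats until no new task is capped.
pub fn feasible_shares(weights: &[f64], nr_cpus: f64) -> Vec<f64> {
    let n = weights.len();
    let mut shares = vec![0.0; n];
    let mut capped = vec![false; n];
    let mut cpus_left = nr_cpus;
    loop {
        // Total weight of tasks that have not been capped yet.
        let wsum: f64 = (0..n).filter(|&i| !capped[i]).map(|i| weights[i]).sum();
        if wsum <= 0.0 {
            break;
        }
        let mut newly_capped = 0usize;
        for i in 0..n {
            if !capped[i] && weights[i] / wsum * cpus_left > 1.0 {
                capped[i] = true;
                shares[i] = 1.0;
                newly_capped += 1;
            }
        }
        if newly_capped == 0 {
            // Everyone left fits within one CPU: hand out the final shares.
            for i in 0..n {
                if !capped[i] {
                    shares[i] = weights[i] / wsum * cpus_left;
                }
            }
            break;
        }
        // Each capped task consumes exactly one CPU.
        cpus_left -= newly_capped as f64;
    }
    shares
}
```

For example, on 4 CPUs with one task of weight 10000 and four tasks of weight 100, the heavy task's nominal share is roughly 3.85 CPUs even though it can only run on one. A load balancer that trusts the nominal figure would reserve nearly the whole machine for it and leave cores idle. The feasible shares in this sketch are 1.0 CPU for the heavy task and 0.75 CPUs for each of the others.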