Introduce scx_flash (Fair Latency-Aware ScHeduler), a scheduler that
focuses on ensuring fairness among tasks and performance predictability.
This scheduler is introduced as a replacement for the "lowlatency" mode
in scx_bpfland, which was dropped in commit 78101e4 ("scx_bpfland:
drop lowlatency mode and the priority DSQ").
scx_flash operates based on an EDF (Earliest Deadline First) policy,
where each task is assigned a latency weight. This weight is adjusted
dynamically, influenced by the task's static weight and how often it
releases the CPU before its full assigned time slice is used: tasks that
release the CPU early receive a higher latency weight, granting them
a higher priority over tasks that fully use their time slice.
The combination of dynamic latency weights and EDF scheduling ensures
responsive and stable performance, even in overcommitted systems, making
the scheduler particularly well-suited for latency-sensitive workloads,
such as multimedia or real-time audio processing.
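The policy described above can be sketched roughly as follows. This is only an illustration, not the scx_flash BPF implementation: the `Task` fields, the `SLICE_NS` constant and the scaling formula are assumptions, chosen to show how releasing the CPU early turns into a higher latency weight and therefore an earlier deadline.

```python
import heapq
from dataclasses import dataclass

SLICE_NS = 20_000_000  # hypothetical full time slice (20 ms)


@dataclass
class Task:
    name: str
    static_weight: int   # hypothetical static weight, 100 = default priority
    avg_used_ns: float   # average slice time used before releasing the CPU
    vruntime: float = 0.0


def latency_weight(task: Task) -> float:
    """Assumed formula: the weight grows as the task uses less of its slice."""
    used_fraction = min(max(task.avg_used_ns / SLICE_NS, 0.01), 1.0)
    return task.static_weight / used_fraction


def deadline(task: Task) -> float:
    """A higher latency weight yields an earlier deadline."""
    return task.vruntime + SLICE_NS * 100 / latency_weight(task)


def pick_order(tasks):
    """Return task names in EDF order (earliest deadline first)."""
    heap = [(deadline(t), t.name) for t in tasks]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

With two tasks of equal static weight, the one that typically gives up the CPU after 1 ms is picked before the one that consumes its whole slice.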
Tested-by: Peter Jung <ptr1337@cachyos.org>
Tested-by: Piotr Gorski <piotrgorski@cachyos.org>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
* Fix a couple of misc errors in build scripts.
* Tweak scripts/kconfigs to make bpftrace work.
* Update how CI caching works to make builds faster (6 minute turnaround
time)
* Update CI config to generate per-scheduler debug archives w/ guest
dmesg/scheduler stdout, guest stdout, bpftrace script output,
veristat output.
* Update build scripts to accept the following:
** VNG RW -- write to host filesystem (better caching, logging).
* For stress tests in particular (via ini config):
** QEMU Opts -- to facilitate reproducing bugs (e.g. high core count).
** bpftrace scripts -- specify bpftrace scripts to run during stress
tests.
* enable bpftrace when using stress tests
update meson/stress test runner to enable
running bpftrace scripts while running
stress tests.
* disable layered stats output on ci
Rust build was using two separate workspaces - rust/ and scheds/rust.
There's no reason to separate them and it makes doc generation tricky. Use
single top level workspace so that we can drive all rust building from
cargo.
split build and test jobs to reduce ci turnaround time
and make it clear what is failing when something fails.
also add virtiofsd to deps to make test compilation faster
(most test time is compilation) and remove all force 9ps.
Recently, we have triggered some OOM conditions during stress tests,
particularly with the user-space schedulers. To avoid this issue and
prevent false positives, increase the memory size of the virtme-ng
instance from the default 1GB to 2GB.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
meson build script was building each rust sub-project under rust/ and
scheds/rust/ separately. This means that each rust project is built
independently which leads to a couple problems - 1. There are a lot of
shared dependencies but they have to be built over and over again for each
project. 2. Concurrency management becomes sad - we either have to unleash
multiple cargo builds at the same time possibly thrashing the system or
build one by one.
We've been trying to solve this from meson side in vain. Thankfully, in
issue #546, @vimproved suggested using cargo workspace which makes the
sub-projects share the same target directory and be built together by the same
cargo instance while still allowing each project to behave independently for
development and publishing purposes.
Make the following changes:
- Create two cargo workspaces - one under rust/, the other under
scheds/rust/. Each contains all rust projects underneath it.
- Don't let meson descend into rust/. These are libraries used by the rust
schedulers. No need to build them from meson. Cargo will build them as
needed.
- Change the rust_scheds build target to invoke `cargo build` in
scheds/rust/ and let cargo do its thing.
- Remove per-scheduler meson.build files and instead generate custom_targets
in scheds/rust/meson.build which invokes `cargo build -p $SCHED`.
- This changes rust binary directory. Update README and
meson-scripts/install_rust_user_scheds accordingly.
- Remove per-scheduler Cargo.lock as scheds/rust/Cargo.lock is shared by all
schedulers now.
- Unify .gitignore handling.
The following are build times on a Ryzen 3975W:
Before:
  ________________________________________________________
  Executed in  165.93 secs    fish           external
     usr time   40.55 mins    2.71 millis   40.55 mins
     sys time    3.34 mins   36.40 millis    3.34 mins
After:
  ________________________________________________________
  Executed in   36.04 secs    fish           external
     usr time  336.42 secs    0.00 millis  336.42 secs
     sys time   36.65 secs   43.95 millis   36.61 secs
Wallclock time is reduced about 4.6x and CPU time about 7x.
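The ratios can be checked directly from the timings above. The short calculation below only converts the reported figures to seconds and divides them; CPU time is taken as usr + sys.

```python
def to_secs(value: float, unit: str) -> float:
    """Convert a fish `time` figure to seconds."""
    return value * {"secs": 1.0, "mins": 60.0}[unit]


before_wall = to_secs(165.93, "secs")
before_cpu = to_secs(40.55, "mins") + to_secs(3.34, "mins")

after_wall = to_secs(36.04, "secs")
after_cpu = to_secs(336.42, "secs") + to_secs(36.65, "secs")

wall_speedup = before_wall / after_wall
cpu_speedup = before_cpu / after_cpu

print(f"wallclock: {wall_speedup:.1f}x, CPU: {cpu_speedup:.1f}x")
# prints "wallclock: 4.6x, CPU: 7.1x"
```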
scx_layered now can be run with a single command when `--run-example` is
specified. Update test_sched script to support per-sched arguments and
enable it for scx_layered.
Add a github action to run
[`veristat`](https://github.com/libbpf/veristat) during PRs and merges.
This uses an [action cache](https://github.com/actions/cache) to store
the veristat values when a PR is merged.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Add a meson script to run veristat. This can later be used to generate
reports for BPF program complexity at PR time.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
When fetching external git repositories (libbpf and bpftool) we don't
check if the target commit exists.
This can lead to issues such as #400, because we may silently use HEAD,
instead of the specified commit.
Prevent this by returning an error when the target SHA1 cannot be found.
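The idea can be sketched as below. This is not the fetch script itself, which is a shell script; the function names and the injectable `run` parameter are assumptions made for illustration. It relies on `git cat-file -e`, which exits non-zero when the named object does not exist.

```python
import subprocess


def commit_exists(repo_dir: str, sha: str, run=subprocess.run) -> bool:
    """Return True if `sha` names a commit object in the repository."""
    result = run(
        ["git", "-C", repo_dir, "cat-file", "-e", f"{sha}^{{commit}}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def checkout_or_fail(repo_dir: str, sha: str, run=subprocess.run) -> None:
    """Fail loudly instead of silently staying on HEAD when sha is missing."""
    if not commit_exists(repo_dir, sha, run):
        raise SystemExit(f"error: commit {sha} not found in {repo_dir}")
    run(["git", "-C", repo_dir, "checkout", "--quiet", sha], check=True)
```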
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
If LLVM is compiled with the LLVM_VERSION_SUFFIX cmake option, then the
version may have an additional suffix, for example "18.1.7+libcxx".
Gentoo for example uses this to fend off ABI issues between libstdc++
and libc++.
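A suffix-tolerant parse could look like the sketch below. It is an assumption about one reasonable approach, not the code used by the build scripts, and `parse_llvm_version` is a hypothetical helper name.

```python
import re


def parse_llvm_version(version: str) -> tuple:
    """Return (major, minor, patch), ignoring any trailing suffix."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized LLVM version: {version!r}")
    return tuple(int(part) for part in match.groups())
```

Comparing the resulting tuples keeps version checks numeric, so a string such as "18.1.7+libcxx" compares the same as "18.1.7".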
Signed-off-by: Violet Purcell <vimproved@inventati.org>
This change adds stress-ng as a load test for schedulers when running in
CI. It will run stress-ng while schedulers are being tested with a
reasonable amount of work. At the end of the run the stress-ng metrics
are collected for later analysis. However, since the tests may be
running in a VM, the results may not be very robust.
When CDPATH is set the fetch_libbpf build script will cd into
the preferred CDPATH directory. This change removes the CDPATH
environment variable so any preferred CDPATH paths are ignored.
The issue can be reproduced with the following steps:
1) mkdir -p /tmp/libbpf
2) CDPATH=/tmp/ meson setup build --prefix /tmp
The build should fail at the fetch_libbpf step.
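The same precaution applies to any wrapper that launches a shell script. The sketch below is an assumption for illustration only (the actual change is made in the build script); `env_without_cdpath` and `run_script` are hypothetical helpers.

```python
import os
import subprocess


def env_without_cdpath(env=None) -> dict:
    """Copy the environment, dropping CDPATH so `cd` stays relative to cwd."""
    source = os.environ if env is None else env
    return {key: value for key, value in source.items() if key != "CDPATH"}


def run_script(script: str) -> int:
    """Run a shell script with CDPATH removed from its environment."""
    return subprocess.run(["sh", script], env=env_without_cdpath()).returncode
```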
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Build fails with sccache.
- Update meson-scripts/build_bpftool to support sccache. Unfortunately, this
isn't enough.
- Update README to warn about sccache and add the instruction to disable it
for building scx.
- Also add ⚠️ to make boot loader update step more prominent in arch
installation instruction.
get_clang_ver fails if clang is built from scratch.
Teach get_clang_ver to recognize the clang version even for clang built
from git.
These are the tests I ran:
# /usr/local/bin/clang --version
clang version 18.0.0git (https://github.com/llvm/llvm-project.git c458f928fad7bbcf08ab1da9949eb2969fc9f89c)
# meson-scripts/get_clang_ver /usr/local/bin/clang
18.0.0
# /usr/bin/clang --version
clang version 17.0.6 (CentOS 17.0.6-5.el9)
# meson-scripts/get_clang_ver /usr/bin/clang
17.0.6
Signed-off-by: Breno Leitao <leitao@debian.org>
Instead clone the libbpf repo at a specific hash during setup.
This is to fix an issue whereby submodules are not included
in the tarball and therefore won't be updated/fetched during
setup after unzipping the tarball.
Otherwise, we end up passing CC=ccache to libbpf's Makefile which triggers
an error as ccache invoked on its own can't act as a stand-in for the
compiler.
This is to potentially reduce issues with folks
using different versions of libbpf at runtime.
This also:
- makes static linking of libbpf the default
- adds steps in `meson setup` to fetch libbpf and make it
These two schedulers are provided mostly as examples / PoC, so we should
exclude them from our periodic testing, to prevent triggering false
positives in our CI.
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Search for potential errors only in the kernel logs and the scheduler
stderr.
In this way we can use "error keywords" in the scheduler's output
without triggering false positives in the CI (see for example #127).
NOTE: this works, because virtme-ng, when executed in verbose mode,
sends the kernel messages to stderr (together with the command's stderr)
and it channels the command's stdout to the stdout of the host.
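A rough sketch of the check, assuming the two streams have already been captured separately. The keyword list and function names are hypothetical, not the ones used by the CI scripts.

```python
import re

# Hypothetical error keywords; the real CI list may differ.
ERROR_PATTERN = re.compile(r"\b(error|bug|warning|oops)\b", re.IGNORECASE)


def find_errors(stderr_text: str) -> list:
    """Return the stderr lines (kernel log + scheduler stderr) that look like errors."""
    return [line for line in stderr_text.splitlines() if ERROR_PATTERN.search(line)]


def run_is_clean(stdout_text: str, stderr_text: str) -> bool:
    """stdout is deliberately ignored, so keywords printed there are harmless."""
    return not find_errors(stderr_text)
```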
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Use virtme-ng to run the schedulers after they're built; virtme-ng
allows picking an arbitrary sched-ext enabled kernel and running it
while virtualizing the entire user-space root filesystem, so we can
execute the recompiled schedulers inside such a kernel.
This should allow catching potential run-time issues in advance (both in
the kernel and the schedulers).
The sched-ext kernel is taken from the Ubuntu ppa (ppa:arighi/sched-ext)
at the moment, since it is the easiest / fastest way to get a
precompiled sched-ext kernel to run inside the Ubuntu 22.04 testing
environment.
The schedulers are tested using the new meson target "test_sched", the
specific actions are defined in meson-scripts/test_sched.
By default each test has a timeout of 30 sec after virtme-ng
completes the boot (that should be enough to initialize the scheduler
and run it for some seconds), while the total lifetime of the
virtme-ng guest is set to 60 sec; after this time the guest will be
killed (this allows catching potential kernel crashes / hangs).
If a single scheduler fails the test, the entire "test_sched" action
will be interrupted and the overall test result will be considered a
failure.
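The fail-fast behaviour with a hard guest lifetime can be sketched as below. This is an illustration, not meson-scripts/test_sched: the command lists are placeholders supplied by the caller, the names are hypothetical, and the inner 30 sec per-test timeout is not modeled.

```python
import subprocess

GUEST_LIFETIME_SEC = 60  # hard limit after which the guest is killed


def run_one(cmd, timeout=GUEST_LIFETIME_SEC, run=subprocess.run) -> bool:
    """Run one guest command; a hang (timeout) counts as a failure."""
    try:
        return run(cmd, timeout=timeout).returncode == 0
    except subprocess.TimeoutExpired:
        return False


def run_all(commands, run=subprocess.run) -> bool:
    """Stop at the first failing scheduler and report overall failure."""
    for name, cmd in commands:
        if not run_one(cmd, run=run):
            print(f"FAIL: {name}")
            return False
        print(f"OK: {name}")
    return True
```

`subprocess.run` kills the child process when the timeout expires before raising `TimeoutExpired`, which is what turns a hung guest into a test failure here.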
At the moment scx_layered is excluded from the tests, because it
requires a special configuration (we should probably pre-generate a
default config in the workflow actions and change the scheduler to use
the default config if it's executed without any argument).
Moreover, scx_flatcg is also temporarily excluded from the tests,
because of these known issues:
- https://github.com/sched-ext/scx/issues/49
- https://github.com/sched-ext/sched_ext/pull/101
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
- combine c and kernel-examples as it's confusing to have both
- rename 'rust-user' and 'c-user' to just 'rust' and 'c', which is simpler
- update and fix sync-to-kernel.sh
Some distros may add their own prefix to the version string of clang, for
example in Ubuntu:
$ clang --version
Ubuntu clang version 17.0.5 (1ubuntu1)
...
That triggers the following meson error during the setup phase:
meson.build:25:44: ERROR: String '' cannot be converted to int
Change the regexp used to evaluate the clang version to avoid this
build failure.
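The intent of the regexp change can be illustrated with the sketch below. The pattern is an assumption, not the expression used in meson.build; it searches for "clang version" anywhere in the line instead of expecting it at the start, so a distro prefix no longer produces an empty match.

```python
import re

# Assumed pattern: match anywhere in the output, not only at the line start.
VERSION_RE = re.compile(r"clang version (\d+)\.(\d+)\.(\d+)")


def clang_version(version_output: str) -> tuple:
    """Extract (major, minor, patch) from `clang --version` output."""
    match = VERSION_RE.search(version_output)
    if match is None:
        raise ValueError("could not find a clang version string")
    return tuple(int(part) for part in match.groups())
```

Raising an explicit error when nothing matches is a design choice for the sketch: it avoids passing an empty string on to an integer conversion.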
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>